[Chicago-talk] recoverable (human-readable) persistent data stores?
Jonathan Rockway
jon at jrock.us
Sun Jul 9 01:08:20 PDT 2006
Does anyone have any suggestions for something database-like that stores
itself in a human readable form?
I'm looking to replace YAML files (in my backup software, Chroniton*)
that contain tens of thousands of entries like this:
> /home/jon/tmp/.pm:
>   - !!perl/hash:Chroniton::File
>     location: /tmp/backups/backup_1152419747.22405
>     metadata:
>       atime: 1152420555
>       attributes:
>         user.testattribute: foo
>         user.creation_time: 1152339615
>       ctime: 1152339615
>       gid: jon
>       md5: c04b397efc6df812d0668d48b631e93b
>       mtime: 1152339615
>       permissions: -rw-r--r--
>       size: 105
>       uid: jon
>     name: /home/jon/tmp/.pm
>     type: file
>
> /home/jon/tmp/foo:
>   - etc.
with something that I can load into memory incrementally, and then store
back to disk incrementally (i.e., I only need one record in core at a
time, but while it's in core it gets read and written).
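
A minimal sketch of that access pattern, assuming one JSON document per line in place of the YAML above. The file format and the names rewrite_records and update are illustrative assumptions, not Chroniton's code:

```python
import json
import os
import tempfile


def rewrite_records(path, update):
    """Stream every record through `update`, one record in memory at a time.

    Assumes `path` holds one JSON object per line (a hypothetical layout).
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as out, \
                open(path, encoding="utf-8") as src:
            for line in src:
                if not line.strip():
                    continue  # skip blank lines
                record = update(json.loads(line))
                out.write(json.dumps(record, sort_keys=True) + "\n")
        # Replace the old file only after every record has been written.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because each line parses on its own, a damaged line should cost only that one record, and the file stays readable with a pager or grep.
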
I'd like to avoid a sqlite or berkeley database file, because if the
file gets corrupted somehow, all the data tends to get lost. (Ever move
a bdb svn repository between machines? It just doesn't work.) I've
also been burned a number of times with sqlite shared library updates
losing my data. Since the point of backup software is to be able to
restore your machine when you hose it, I can't be dependent on having
version 1.3.3.7_42 of some shared library around.
The other obvious option, using an individual file for each record, is
both cumbersome and inefficient -- on my filesystem each file takes 4k
(and I've configured systems where each file is 32M at a minimum!).
For the 54631 files in my ~/tmp directory (not really temporary files,
btw) this would use 213M of disk at the very minimum. That's 10%
overhead, and isn't acceptable :) (The compressed YAML only takes up 1.9M!)
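
The 213M figure can be checked with a short calculation, assuming every record occupies exactly one 4 KiB block (the function name is hypothetical):

```python
def min_disk_usage_mib(n_files, block_kib=4):
    """Minimum space used when each file occupies at least one block."""
    return n_files * block_kib / 1024


print(f"{min_disk_usage_mib(54631):.0f} MiB")  # prints "213 MiB"
```
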
BTW, reading in the whole file and delete-ing hash keys frees up memory
according to Devel::Size, but the perl process's memory footprint never
shrinks.
With these restrictions in place, I'm kind of out of ideas, so any
insight would be greatly appreciated. Thanks!
Regards,
Jonathan Rockway
* GPLd and available from CPAN or http://www.jrock.us/trac/chroniton