[Za-pm] Working with Berkeley DB - a few questions

Jan Henkins jan at henkins.za.net
Mon Jun 20 08:30:22 PDT 2005

Hello people,

Only recently I started to muck about with Perl in a real programming
sense, other than writing the odd script to help me with my sysadmin
stuff on various Linux/Unix/Win32 machines and servers. So, working with
databases is a new thing to me. The power and elegance of the DBI
methods is astounding! You can imagine me giggling like a schoolgirl
after I realised that I can update a CSV file (never mind a proper MySQL
or PostgreSQL table) with SQL statements. The DBI tools helped me to
build quite a nice system in terms of storing various configuration
items in a database, be it a proper SQL engine or a common CSV file.

Now, my next foray into databases is DBM, specifically Berkeley DB.
Reason being that it is used as a config database for an Internet Cafe
system (OpenKiosk, see http://openkiosk.sourceforge.net/), and I would
like to pull data from these files in order to unify reporting. It's so
different from SQL databases that I actually have a few problems getting
my flat head around the "key=value" concept used in DBM files. 

OK, enough background noise, here is what I've got. I've managed to
start getting some data out of a set of DBM-formatted files. What threw
me for a loop is that they are formatted as Btree, not Hash or any other
way. So, I managed to tap together the following script after wading
through a lot of new, weird (for me at least) information in the
BerkeleyDB module (just do a "perldoc BerkeleyDB" once you've installed
the module to see what I mean):

# Filename: bdbmfile.pl
use strict;
use warnings;
use BerkeleyDB;
use Fcntl;

die "Usage: $0 dbmfile\n" if (@ARGV < 1);

my $filename = $ARGV[0];
print "Dumping $filename:\n\n";

# Open the Btree file read-only and tie it to a hash
my %h;
tie %h, 'BerkeleyDB::Btree',
        -Filename       => $filename,
        -Flags          => DB_RDONLY
or die "Cannot open $filename: $! $BerkeleyDB::Error\n";

# Print every key stored in the file, one per line
foreach (keys %h) {
        print "$_\n";
}

untie %h;

Rightyho, there isn't any original thinking behind this script; it's a
conglomeration of cryptic clues and almost useless examples I bumped
into in my "quest for info". Amazing how bad the docs are for Berkeley
DB in comparison with other DB methods (like GNU dbm).

On to my question:

The above script only yanks out and displays one set of key values. I
have no proper idea (other than from horsing around in the C++ source
code of the Icafe application) of what keys are being used, or how many
different keys are used. How is it possible to leech all the different
keys out of the db file and then do a simple, formatted dump of all the
"key=value" pairs without knowing the internal structure of the db file?
At least I know that it's Btree formatted, from using the "file" command
on it like this:

jan at slashbat dataset $ file access.db
access.db: Berkeley DB (Btree, version 9, native byte-order)
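
To make the question concrete, this is roughly the kind of dump I am
after. It is only a sketch under assumptions, not a known-good answer:
it assumes that walking the tied hash with each() returns every record
in the file, and that the values may be packed binary data rather than
plain text, so anything outside printable ASCII is shown as \xNN. The
helper names printable and dump_pairs are made up for illustration.

```perl
use strict;
use warnings;

# Make a string safe to print: every byte outside printable ASCII
# is replaced by a literal \xNN escape.
sub printable {
    my ($bytes) = @_;
    (my $text = $bytes) =~ s/([^\x20-\x7e])/sprintf '\\x%02x', ord $1/ge;
    return $text;
}

# Return one "key=value" line per record of any hash, tied or plain.
sub dump_pairs {
    my ($hash) = @_;
    my @lines;
    # each() fetches one record at a time instead of building a key list.
    while (my ($key, $value) = each %$hash) {
        push @lines, printable($key) . '=' . printable($value);
    }
    return @lines;
}

if (@ARGV) {
    require BerkeleyDB;    # loaded only when there is a file to open
    my $filename = $ARGV[0];
    tie my %h, 'BerkeleyDB::Btree',
        -Filename => $filename,
        -Flags    => BerkeleyDB::DB_RDONLY()
        or die "Cannot open $filename: $!\n";
    print "$_\n" for dump_pairs(\%h);
    untie %h;
}
```

Saved under any name and run with access.db as its only argument, this
should print one line per record. Making sense of the bytes in each
value would still need the record layout from the application source.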

I attach the db file (called access.db, a small 8k file) to this mail,
hoping that it will stay attached to the body. If not, I can mail it
directly to whomever might be interested, or make it available via
FTP/HTTP somewhere. If you do manage to get this file with this mail and
run the above script against it (with the BerkeleyDB module installed of
course), you should see output like this:

jan at slashbat dataset $ ./bdbmfile.pl access.db
Dumping access.db:


Each of these numbers is a session key used by OpenKiosk to identify a
client session. Now, attached to each session key is a whole host of
other info, like (1) whether it has been used, (2) how long the session
is valid for, (3) how much it costs, (4) which operator sold this
session to a client, etc.

Any help, thoughts, ideas and even illuminating flames would be very
much appreciated! :-)

Jan Henkins
-------------- next part --------------
A non-text attachment was scrubbed...
Name: access.db
Type: application/octet-stream
Size: 8192 bytes
Desc: not available
Url : http://mail.pm.org/pipermail/za-pm/attachments/20050620/b373037e/access.obj
