[Chicago-talk] writing log files to a DB

Steven Lembark lembark at wrkhors.com
Sun Mar 27 21:47:23 PST 2005


> @ARGV = "log_files";
> while (<>) {
>
>     my ($src, $dsc, $clid, $chn, $PU, $HU, $timestamp, $dur)
>         = split(",", $_);
>
>     my $q = "REPLACE INTO master_log VALUES (?,?,?,?,?,?,?,?)";
>     my $sth = $dbh->prepare($q);
>     $sth->execute($src, $dsc, $clid, $chn, $PU, $HU, $timestamp, $dur);
> }
>
> This seems inefficient because it reinserts the whole
> file each time it's run and the file keeps getting
> larger. So I was going to change so that it first gets

Use a decent key in the database, one that includes the
timestamp, and duplicate records will be rejected (with
REPLACE, as above, they overwrite in place rather than
accumulate).
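For example (a minimal sketch: it assumes MySQL, given the
REPLACE above, and that the columns carry the same names as
the variables in the insert):

  # one-time schema change: a composite primary key on the
  # columns that identify a log record. with this in place
  # REPLACE overwrites a re-inserted row; a plain INSERT
  # would reject it as a duplicate.

  $dbh->do
  (
      'ALTER TABLE master_log ADD PRIMARY KEY ( timestamp, src, clid )'
  );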

Given that this is a single-user application, you don't
have to be [too] paranoid about race conditions. In that
case you can just "select max(timestamp) from tablename"
and only commit records newer than that timestamp minus
one second (the one-second fudge catches rows that share
the last stored timestamp).
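Something like this (a sketch: it assumes the timestamp is
the seventh comma-separated field and compares numerically,
e.g. epoch seconds):

  # prepare once, outside the loop.

  my $sth = $dbh->prepare
  (
      'REPLACE INTO master_log VALUES (?,?,?,?,?,?,?,?)'
  );

  # newest timestamp already stored, less the one-second
  # fudge factor; defaults sanely if the table is empty.

  my ( $last ) = $dbh->selectrow_array
  (
      'SELECT MAX(timestamp) FROM master_log'
  );

  $last = ( $last || 0 ) - 1;

  while( <> )
  {
      chomp;

      my @fieldz = split ',', $_, 8;

      next if $fieldz[6] <= $last;    # already stored

      $sth->execute( @fieldz );
  }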

The simplest fix is to insert the rows into a scratch
table that is emptied prior to each use. You then insert
into the permanent table those scratch rows whose timestamp
is >= max(timestamp) from the real table. That leaves
your DBI code running quickly and the database dealing with
the messiest part in bulk (vs. record-by-record).
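A sketch of that route, assuming a scratch_log table with
the same layout as master_log (the name is made up):

  # empty the scratch table, then bulk-load the file into it.

  $dbh->do( 'DELETE FROM scratch_log' );

  my $load = $dbh->prepare
  (
      'INSERT INTO scratch_log VALUES (?,?,?,?,?,?,?,?)'
  );

  while( <> )
  {
      chomp;
      $load->execute( split ',', $_, 8 );
  }

  # move only the new rows across in one bulk statement
  # instead of one round trip per record. REPLACE absorbs
  # the boundary rows that tie the stored max(timestamp).

  my ( $last ) = $dbh->selectrow_array
  (
      'SELECT MAX(timestamp) FROM master_log'
  );

  $dbh->do
  (
      'REPLACE INTO master_log SELECT * FROM scratch_log WHERE timestamp >= ?',
      undef,
      $last || 0
  );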

If the logging can include a sequence number then use that
instead of a timestamp: sequence numbers are unique, so you
can drop the one-second fudge factor entirely.

If you're doing much more of this, I'd strongly suggest
C. J. Date, _An Introduction to Database Systems_ -- last
I saw it was up to the 7th edition, though for your purposes
anything from the 5th on up will do nicely.


-- 
Steven Lembark                                       85-09 90th Street
Workhorse Computing                                Woodhaven, NY 11421
lembark at wrkhors.com                                     1 888 359 3508

