SPUG: filtering

Charles Mauch cmauch at gmail.com
Mon Aug 1 09:01:59 PDT 2005

This probably isn't a Perl-specific question, but I've been trying to
figure out how to solve a (probably simple) filtering task.

I have a bunch of data, harvested from incoming emails, and I'd like to
periodically clean it up.  The format is pretty simple: three
tab-delimited fields.

user at domain.com <tab> Real Name <tab> Date Inserted

Obviously, I get a bunch of duplicate data.  I've been able to come up with
some Perl code to sort that data by the second field, and even to delete
duplicates where the date is the same.  What I'd like is a filter that,
whenever several lines share the same email and name, drops all of them
except the most recent one (as determined by the date).

I don't need anybody to code this up; I'm just kind of at a loss as to
where to start.  Just looking for ideas on ways to do this.
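
As one possible starting point, the filter described above can be sketched
with a hash keyed on the (email, name) pair, where each entry remembers the
newest line seen so far.  The sketch below is in Python purely for
illustration; the same shape should carry over to a Perl hash.  The names
keep_newest and DATE_FORMAT are hypothetical, and the YYYY-MM-DD date
layout is an assumption, since the real date format isn't given here.

```python
from datetime import datetime

# Assumption: the third field looks like "2005-08-01".  Adjust DATE_FORMAT
# to match whatever the real file uses.
DATE_FORMAT = "%Y-%m-%d"


def keep_newest(lines):
    """Return one line per (email, name) pair, keeping the latest date."""
    newest = {}  # (email, name) -> (parsed date, original line)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        email, name, date_text = line.split("\t")
        stamp = datetime.strptime(date_text.strip(), DATE_FORMAT)
        key = (email, name)
        # Replace the stored line only when this one is strictly newer.
        if key not in newest or stamp > newest[key][0]:
            newest[key] = (stamp, line)
    return [line for _, line in newest.values()]


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        "user@example.com\tReal Name\t2005-07-01",
        "other@example.com\tOther Name\t2005-07-15",
        "user@example.com\tReal Name\t2005-08-01",
    ]
    for record in keep_newest(sample):
        print(record)
```

Comparing parsed dates rather than raw strings avoids depending on the
text happening to sort chronologically.  The key is compared exactly as
written, so addresses differing only in case would count as distinct
unless they are normalized first.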

And no, this isn't for a spam program or anything.  It's for my lbdb
database, which mutt queries to autocomplete when composing emails. :)

Take it easy

Charles Mauch, [cmauch at gmail.com], Big Time Glue Eater
Every message PGP or S/MIME signed to verify authenticity.
"There are no great men,  only great challenges that ordinary
men are forced by circumstances to meet."
                                   --- Admiral William Halsey