[oak perl] Comparing two files
Chris Yager
iceman at prado.com
Tue May 31 00:55:00 PDT 2005
Mike,
In my tests, the most efficient way to tell unique lines
from duplicate lines was with a Perl hash.
Enclosed please find:
"dups.pl" does the work.
"dupslib.pm" puts content in your scalars and
consumes $differences.
dups.pl uses one group of lines to fill the hash with the lines of
$longfile; it then uses another group of lines to find the lines
in $shortfile that are unique.
The intent is to make it clear what is going on.
dups.pl has 2 lines commented out at the end.
They do the same thing as the 2 groups of earlier lines,
but are more obscure.
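Since the attachments are scrubbed from the archive, here is a minimal
sketch of the hash approach described above. It is not the actual dups.pl:
the sub names unique_lines and unique_lines_terse and the sample data are
made up for illustration, and the lines are held in arrays rather than
read from $shortfile and $longfile.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the lines of @$short that do not appear anywhere in @$long.
# (Hypothetical helper, written out the long way for clarity.)
sub unique_lines {
    my ($short, $long) = @_;

    # Group 1: fill the hash, one key per line of the long file.
    my %seen;
    for my $line (@$long) {
        $seen{$line} = 1;
    }

    # Group 2: keep each short-file line that is not a key in the hash.
    my @unique;
    for my $line (@$short) {
        push @unique, $line unless exists $seen{$line};
    }
    return @unique;
}

# The same result in two terser (and more obscure) lines.
sub unique_lines_terse {
    my ($short, $long) = @_;
    my %seen = map { $_ => 1 } @$long;
    return grep { !exists $seen{$_} } @$short;
}

# Made-up sample data standing in for the two files.
my @long  = ("alpha\n", "beta\n", "gamma\n");
my @short = ("beta\n", "delta\n", "alpha\n", "epsilon\n");

my @differences = unique_lines(\@short, \@long);
print @differences;    # prints the "delta" and "epsilon" lines
```

Each file is walked once, and each hash lookup takes roughly constant
time, so $longfile never has to be reopened or rescanned.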
Did you enjoy the deluge of responses?
You wrote:
>my $shortfile;
>my $longfile;
>my $differences;
>
>
>I'm writing a script to compare two text files ($shortfile & $longfile).
>If a line appears in $shortfile, but that line is not in $longfile, then
>I want to write that line out to $differences
>
>I'm relatively certain it is not efficient to open $longfile for each
>entry in $shortfile. Both files are of the magnitude of 800+ lines.
>
>For example, a given line in $shortfile is found at line 333 in
>$longfile. Without closing and reopening $longfile, I don't know how to
>reset the 'pointer' in $longfile back to line 1.
>
>Perhaps there is a better way of doing this. I hope I've explained what
>I'm trying to do clearly.
>
>Suggestions ?
>
>Thanks,
>Mike
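On the 'pointer' question quoted above: an open filehandle can be rewound
with Perl's built-in seek, without closing and reopening the file. A
minimal sketch follows; the scratch file name rewind_demo.txt is made up
for illustration. With the hash approach the rewind is not needed at all.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Write a small scratch file so the sketch is self-contained.
my $path = "rewind_demo.txt";    # hypothetical file name
open(my $out, '>', $path) or die "Cannot write $path: $!";
print $out "one\ntwo\nthree\n";
close($out);

open(my $fh, '<', $path) or die "Cannot read $path: $!";
my @first_pass = <$fh>;          # read all the way to end of file

# Move back to byte 0, i.e. the start of line 1.
seek($fh, 0, 0) or die "Cannot seek: $!";
$. = 0;                          # seek does not reset the line counter

my $first_line = <$fh>;          # reads "one\n" again
close($fh);
unlink $path;
```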
Chris Yager
(510)317-5900
iceman at prado.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dups.pl
Type: application/octet-stream
Size: 1406 bytes
Desc: not available
Url : http://mail.pm.org/pipermail/oakland/attachments/20050531/fc3010d8/dups.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: dupslib.pm
Type: application/octet-stream
Size: 710 bytes
Desc: not available
Url : http://mail.pm.org/pipermail/oakland/attachments/20050531/fc3010d8/dupslib.obj