[Pdx-pm] string comparison vs hash
Thomas Keller
kellert at ohsu.edu
Tue May 29 21:51:08 PDT 2007
I thought that I got around that problem by using three different
file handles, one for each of the three compare subroutines. But it
seemed worth testing. I commented out everything but the $fh->open()
statement; then I added back the file read method; and finally the
line-processing method. Here are the numbers:
$ perl benchmark_hash_vs_grep
A. Perl Benchmark.pm examples

1. open 3 filehandles sequentially

                   Rate with_hash with_string_cmp with_grep
with_hash       33998/s        --             -1%       -2%
with_string_cmp 34296/s        1%              --       -1%
with_grep       34600/s        2%              1%        --

2. open and read: slurp into an array (@lines = <$fh>) vs while (<$fh>) { }

                   Rate with_string_cmp with_grep with_hash
with_string_cmp  6140/s              --       -1%      -39%
with_grep        6178/s              1%        --      -39%
with_hash       10049/s             64%       63%        --

3. open, read, and process lines

                  Rate with_grep with_string_cmp with_hash
with_grep        169/s        --            -87%      -90%
with_string_cmp 1297/s      667%              --      -25%
with_hash       1723/s      918%             33%        --

1. Using separate fh's seems to have avoided the problem of advantage
due to order (cache vs fresh read): the three open-only rates are
within 2% of each other.
2. The while (<$fh>) { do nothing } (the 'with_hash' approach) beats
the slurp into an array read method, used by the other two, quite
handily (about 64% faster).
3. The hash method still beats the string compare method (about 33%
faster), and the grep method is not even close (roughly a tenth of
the hash rate).
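
For anyone who wants to compare just the three lookup styles with the
file reading taken out of the picture, here is a minimal sketch. It is
not the attached script: the data (@lines, @wanted), the sub bodies and
the "id_N" strings are all made up for illustration, and the lookup
hash is built once before the timing starts, which may well differ from
what the real benchmark does.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical data: the original scripts and input file are not shown,
# so build the "lines" in memory instead of reading them from disk.
my @lines  = map { "id_$_" } 1 .. 1000;
my @wanted = map { "id_$_" } grep { $_ % 100 == 0 } 1 .. 2000;

# Build the lookup hash once, outside the timed code (an assumption).
my %seen = map { $_ => 1 } @lines;

# Count wanted ids found via hash lookup.
sub with_hash {
    my $hits = 0;
    for my $w (@wanted) {
        $hits++ if exists $seen{$w};
    }
    return $hits;
}

# Count wanted ids found via an explicit string-comparison loop
# that stops at the first match.
sub with_string_cmp {
    my $hits = 0;
    for my $w (@wanted) {
        for my $line (@lines) {
            if ($line eq $w) { $hits++; last }
        }
    }
    return $hits;
}

# Count wanted ids found via grep, which always scans the whole list.
sub with_grep {
    my $hits = 0;
    for my $w (@wanted) {
        $hits++ if grep { $_ eq $w } @lines;
    }
    return $hits;
}

# Run each sub for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    with_hash       => \&with_hash,
    with_string_cmp => \&with_string_cmp,
    with_grep       => \&with_grep,
});
```

Because nothing is read from disk inside the timed subs, any difference
in the resulting table should come from the lookup method alone. The
absolute rates will not be comparable to the numbers above.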
Thanks for your help Eric and chromatic. This was a really useful
(and fun) exercise for a perennial beginner like myself.
regards,
Tom K
On May 29, 2007, at 3:54 PM, chromatic wrote:
> On Tuesday 29 May 2007 15:35:07 Austin Schutz wrote:
>
>> You are using different file reading techniques. That could be
>> _very_ significant. If you are going to slurp all the lines for the
>> string comparison you should do the same for the hash.
>
> Worse than that, the first code to read the file pays the penalty of
> populating file buffers. Subsequent reads probably all come from a
> warm
> cache.
>
> Mixing IO with benchmarks usually skews the results heavily.
>
> -- c
-------------- next part --------------
A non-text attachment was scrubbed...
Name: benchmark_filehandles
Type: application/octet-stream
Size: 3738 bytes
Desc: not available
Url : http://mail.pm.org/pipermail/pdx-pm-list/attachments/20070529/776b9712/attachment-0003.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: benchmark_read_methods
Type: application/octet-stream
Size: 3735 bytes
Desc: not available
Url : http://mail.pm.org/pipermail/pdx-pm-list/attachments/20070529/776b9712/attachment-0004.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: benchmark_hash_vs_grep
Type: application/octet-stream
Size: 3493 bytes
Desc: not available
Url : http://mail.pm.org/pipermail/pdx-pm-list/attachments/20070529/776b9712/attachment-0005.obj