[Pdx-pm] string comparison vs hash

Eric Wilhelm scratchcomputing at gmail.com
Tue May 29 15:41:08 PDT 2007

# from Thomas J Keller
# on Tuesday 29 May 2007 01:52 pm:

>         my $names_join = join '|', @names;
>         my @goi;
>         if ($fh1->open("< $annot_file")) {
>             my @lines = <$fh1>;
>             foreach (@lines) {
>                 chomp;
>                 push @goi, $_ if $_ =~ m/($names_join)/;

Hashes rock, but this regexp juggling might not be quite fair.  A more 
apt comparison would be either:

  1.  split the input and grep({$_ eq $name} @names)
  2.  $names_join = join("|", map({quotemeta($_)} @names));
      $names_join = qr/$names_join/;

(Untested, and there might be a tidier way to express that, see perlre.)
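
Here is a rough sketch of both alternatives, in case it helps to see them side by side. The sample @names and @lines are made up, the tab-separated layout with the name in the first column is an assumption about the annotation file, and the sub names are hypothetical. The escaping uses quotemeta(), because \Q and \E only take effect when a string is interpolated, not inside single quotes.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real @names and annotation file
# are not shown in the thread.
my @names = ('abc1', 'xyz.2', 'foo');
my @lines = (
    "abc1\tkinase",
    "xyzA2\tunknown",
    "bar\tfoo-like",
    "foo\ttransporter",
);

# Option 1: split each line and compare one field with eq.
# Assumes tab-separated input with the name in the first column.
sub match_by_eq {
    my ($names, $lines) = @_;
    my @goi;
    for my $line (@$lines) {
        my ($field) = split /\t/, $line;
        push @goi, $line if grep { $_ eq $field } @$names;
    }
    return @goi;
}

# Option 2: escape each name, join them into one alternation,
# and compile the pattern once with qr// before the loop.
sub match_by_regex {
    my ($names, $lines) = @_;
    my $alternation = join '|', map { quotemeta($_) } @$names;
    my $re = qr/$alternation/;
    return grep { $_ =~ $re } @$lines;
}

print "$_\n" for match_by_eq(\@names, \@lines);
print "--\n";
print "$_\n" for match_by_regex(\@names, \@lines);
```

Note that the two are not equivalent: the pattern is unanchored (as in the original code), so it also picks up the "bar" line because "foo" appears later in it, while the eq version only looks at the first field.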

With your current code having $names_join as a plain string, the 
pattern is interpolated again on every pass through the foreach.  (Perl 
should skip the actual recompile as long as the string hasn't changed, 
but it still has to build the string and check.)  There should be a 
similar thread in the archive where Randall Hansen was working on a 
grep benchmark: "oh, gross grep() re-evaluates the regex."
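
If you want to measure it rather than take my word for it, something like the following would do. This is only a sketch: the data is generated, the sub names and labels are made up, and I'm not promising any particular numbers. The gap may well be small on your perl. Benchmark ships with perl, and cmpthese() with a negative count runs each sub for at least that many CPU seconds.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical generated data standing in for the real names and file.
my @names = map { "gene$_" } 1 .. 200;
my @lines = map { "gene$_\tannotation $_" } 1 .. 2000;

# The same alternation, once as a plain string and once precompiled.
my $names_join = join '|', map { quotemeta($_) } @names;
my $names_re   = qr/$names_join/;

# Pattern interpolated from a string on every match attempt.
sub with_string {
    my $count = grep { $_ =~ m/($names_join)/ } @lines;
    return $count;
}

# Pattern compiled once, outside the loop.
sub with_qr {
    my $count = grep { $_ =~ $names_re } @lines;
    return $count;
}

# Run each variant for at least one CPU second and print a comparison.
cmpthese(-1, {
    'interpolated' => \&with_string,
    'precompiled'  => \&with_qr,
});
```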

[...proprietary software is better than gpl because...] "There is value
in having somebody you can write checks to, and they fix bugs."
--Mike McNamara (president of a commercial software company)
