[Pdx-pm] searching for multiple strings

Michael Rasmussen mikeraz at patch.com
Mon May 16 13:16:44 PDT 2005


I'm working on a little thing to watch log files for patterns and then
modify firewall rules based on items found.

During benchmarking I came across something that tells me my internal Perl
interpreter (the one in my head) does not know what's going on.

These look like synonyms to me, but they don't behave like synonyms:
  if( (/Illegal user/ || /User unk/ || /no such user/) ) {
  if(/Illegal user|User unk|no such user/) {

They match the same number of lines, but the performance difference is huge:

  [root at tire log]# ./tben maillog.1
  42613 lines to process
  Benchmark: timing 10 iterations of allinone, seps...
    allinone: 71 wallclock secs (68.61 usr +  0.30 sys = 68.91 CPU) @  0.15/s (n=10)
    seps:      3 wallclock secs ( 2.85 usr +  0.00 sys =  2.85 CPU) @  3.51/s (n=10)
  found seps 36980  allinone 36980


Um,  what's going on here?
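
Equal totals do not by themselves show that the two tests pick the same
lines. A minimal sketch of a line-by-line comparison follows. The sample
lines and the helper name disagreements() are made up for illustration and
are not taken from the real maillog.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines, invented for illustration only.
my @sample = (
    "sshd[123]: Illegal user bob from 10.0.0.1",
    "sendmail[456]: User unknown",
    "imapd[789]: no such user: alice",
    "sshd[321]: Accepted password for carol",
);

# Return the lines on which the two forms of the test disagree.
sub disagreements {
    my @lines = @_;
    my @diff;
    for (@lines) {
        # Three separate matches joined with ||
        my $seps     = (/Illegal user/ || /User unk/ || /no such user/) ? 1 : 0;
        # One match using regex alternation
        my $allinone = /Illegal user|User unk|no such user/ ? 1 : 0;
        push @diff, $_ if $seps != $allinone;
    }
    return @diff;
}

my @diff = disagreements(@sample);
print scalar(@diff), " lines where the two tests disagree\n";
```

If this reports zero disagreements on the real log as well, the two forms
differ only in speed, not in what they match.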

Secondary question: both methods look wrong to me, in that there has to be a
better way to do the search, especially since I'll eventually have N substrings
to search for, some of them pulled from a user-specified config file.
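
For the N-substring case, here is a rough sketch of two possible approaches,
assuming the entries are literal substrings rather than regexes. One builds
a single compiled regex from the list with quotemeta and qr//, the other
skips the regex engine and uses index(). The list contents and the helper
names build_matcher() and matches_any() are hypothetical. Which one is
faster on a given Perl would need to be benchmarked, not assumed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical list; in practice some entries might come from a config file.
my @substrings = ('Illegal user', 'User unk', 'no such user');

# Build one compiled regex from literal substrings.
# quotemeta escapes regex metacharacters so user input is matched literally.
sub build_matcher {
    my @subs = @_;
    my $alternation = join '|', map { quotemeta($_) } @subs;
    return qr/$alternation/;
}

# Alternative: plain substring search with index(), no regex involved.
sub matches_any {
    my ($line, @subs) = @_;
    for my $s (@subs) {
        return 1 if index($line, $s) >= 0;
    }
    return 0;
}

my $re   = build_matcher(@substrings);
my $line = "sshd[123]: Illegal user bob from 10.0.0.1";
print "regex match\n" if $line =~ $re;
print "index match\n" if matches_any($line, @substrings);
```

The regex is compiled once, outside the loop over log lines, so adding
substrings does not add per-line compilation work. An empty list would need
separate handling in either version.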

tben is:
#!/usr/bin/perl

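use Benchmark;
# I use maillog.1
@logfile = <>;              # slurp the whole log into memory
$allinone = $seps = 0;      # package globals, visible to the q{} code below
# $#logfile is the last index, so this prints one less than the line count
print "$#logfile lines to process\n";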
timethese ( 10, {

  # print "long $cnt\n";,  # make sure my matches match,
  seps => q{
    for(@logfile) {
      if( (/Illegal user/ || /User unk/ || /no such user/) ) {
        $seps++;
      }
    }
  },
  allinone => q{
    for(@logfile) {
      if(/Illegal user|User unk|no such user/) {
        $allinone++;
      }
    }
  }
});

print "found seps $seps  allinone $allinone\n";

-- 
    Michael Rasmussen, Portland Oregon  
  Be appropriate && Follow your curiosity
 http://meme.patch.com/memes/BicycleRiding
   Get Fixed:  http://www.dampfixie.org
  The fortune cookie says:
Early to bed and early to rise and you'll be groggy when everyone else is
wide awake.


