[Pdx-pm] searching for multiple strings
Wil Cooley
wcooley at nakedape.cc
Mon May 16 14:10:18 PDT 2005
Also Sprach Michael Rasmussen <mikeraz at patch.com> on Mon, May 16, 2005 at 01:16:44PM PDT:
> I'm working on a little thing to watch log files for patterns and then
> modify firewall rules based on items found.
>
> During benchmarking I came across something that said my internal Perl
> interpreter does not know what's going on.
>
> These don't seem to be synonyms:
> if( (/Illegal user/ || /User unk/ || /no such user/) ) {
> if(/Illegal user|User unk|no such user/) {
>
> Not only are they not synonymous, the performance difference is huge:
>
> [root at tire log]# ./tben maillog.1
> 42613 lines to process
> Benchmark: timing 10 iterations of allinone, seps...
> allinone: 71 wallclock secs (68.61 usr + 0.30 sys = 68.91 CPU) @ 0.15/s (n=10)
> seps: 3 wallclock secs ( 2.85 usr + 0.00 sys = 2.85 CPU) @ 3.51/s (n=10)
> found seps 36980 allinone 36980
>
>
> Um, what's going on here?
The first method performs up to 3 regular expression matches for every
line (the || stops at the first one that succeeds), whereas the second
performs only one. A timing difference that large is hard to account
for, though. Performance differences aside, how are they not synonymous?
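
For what it's worth, here is a minimal sketch for checking that the two
forms accept the same lines. The sample lines are made up for
illustration (they are not from the real maillog), and this only
compares match counts, not timing.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for maillog entries.
my @lines = (
    'sshd: Illegal user bob from 10.0.0.1',
    'pop3d: User unknown: alice',
    'imapd: no such user carol',
    'sendmail: message accepted for delivery',
);

my ($seps, $allinone) = (0, 0);
for (@lines) {
    # Three separate matches; || stops at the first one that succeeds.
    $seps++ if /Illegal user/ || /User unk/ || /no such user/;

    # One match using alternation.
    $allinone++ if /Illegal user|User unk|no such user/;
}

print "found seps $seps allinone $allinone\n";
```

If the two counts ever differ on real data, printing the lines where
only one form matches should show where they stop being synonymous.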
> Secondary question, both methods look wrong to me. As in there has
> to be a better way to do the search. Especially when I'll eventually
> have N substrings to search for, some of them pulled from a config
> file specified by the user.
You want something that works like 'grep -f <patternfile>'? It isn't
difficult to slurp in a file, chomp each line, append to a string,
append '|' (if not the last), then use the resulting string as the RE.
You probably also want to compile the RE with 'qr//'.
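
A rough sketch of that idea follows. The helper names (build_re,
read_patterns) and the sample data are mine, purely for illustration,
and an in-memory filehandle stands in for the real pattern file. It
uses join rather than appending '|' by hand, and it assumes the file
holds plain substrings, so each one goes through quotemeta; drop that
if the file is meant to hold real regular expressions.

```perl
use strict;
use warnings;

# Build one compiled regex from a list of literal substrings.
# Assumption: entries are plain strings, so quotemeta escapes any
# regex metacharacters in them.
sub build_re {
    my @pats = grep { length } @_;    # skip blank lines
    return unless @pats;              # nothing to match on
    my $alt = join '|', map { quotemeta } @pats;
    return qr/$alt/;                  # compile once, reuse for every line
}

# Read patterns, one per line, from an already-open filehandle.
sub read_patterns {
    my ($fh) = @_;
    chomp(my @pats = <$fh>);
    return @pats;
}

# Hypothetical pattern list; in real use this would be the user's file.
my $config = "Illegal user\nUser unk\nno such user\n";
open my $fh, '<', \$config or die "cannot open pattern list: $!";
my $re = build_re(read_patterns($fh));
close $fh;

# Hypothetical log lines to try the compiled pattern against.
my @log = (
    'sshd: Illegal user bob from 10.0.0.1',
    'sendmail: message accepted for delivery',
    'imapd: no such user carol',
);
my @hits = grep { /$re/ } @log;
print scalar(@hits), " matching lines\n";
```

Since the qr// object is built once outside the loop, adding more
patterns to the file only changes the alternation, not the code.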
Wil
--
Wil Cooley wcooley at nakedape.cc
Naked Ape Consulting http://nakedape.cc
* * * * Linux, UNIX, Networking and Security Solutions * * * *