Script Question

David Harris dharris at drh.net
Thu Aug 26 14:26:24 CDT 1999


I've done this a lot in Perl, using a single large regular expression.
However, sometimes I think I am pushing it: more than once I have written a
regular expression that backtracks so much that it chews up 20 seconds of CPU
before I kill it.

For example, I got into lots of trouble with this regular expression, which was
designed to parse zone files. You read the whole zone file into $stuff and loop,
grabbing records off the top with this regex and removing blank lines and
comments with others.

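    $stuff =~
    s/
        ^
        \s*
        (?:(\S+)\s+)?               # optional owner name
        (?:\d+\s+)?                 # optional TTL
        (?:IN\s+)?                  # optional class
        (?:([a-zA-Z]+)\s+)          # record type
        (                           # record data
        (?:
            [^\(\n]+ (?: \; .* )?   # plain text, optional trailing comment
            |
            \(                      # parenthesized group, may span lines
            (?:
                [^\)\n\;]*
                (?: \; .* )? \n
                |
                [^\)\n\;]+
            )*?
            \)
        )+?
        )
        \n
    //x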
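
For comparison, here is a minimal sketch of one way to sidestep the
backtracking. It is a different technique from the single regex above, not
what my script does. The idea is to stop asking one regex to find the record
boundary: strip comments line by line, count parentheses to join continued
lines, and then match each complete record with a small anchored regex. The
names split_records and parse_record are hypothetical, and the sketch assumes
that ';' and parentheses never appear inside quoted strings.

```perl
use strict;
use warnings;

# Hypothetical helper: split zone-file text into logical records,
# joining the lines of a ( ... ) group into a single record.
# Assumption: ';' always starts a comment (quoted strings are ignored).
sub split_records {
    my ($stuff) = @_;
    my @records;
    my $current = '';
    my $depth   = 0;

    for my $line (split /\n/, $stuff) {
        $line =~ s/;.*//;                        # drop trailing comment
        next if $depth == 0 && $line !~ /\S/;    # skip blank lines between records

        $current .= ($current eq '' ? '' : ' ') . $line;
        $depth += ($line =~ tr/(//);             # count opening parens
        $depth -= ($line =~ tr/)//);             # count closing parens

        if ($depth <= 0) {                       # parens balanced: record done
            push @records, $current;
            ($current, $depth) = ('', 0);
        }
    }
    push @records, $current if $current =~ /\S/; # unterminated tail, if any
    return @records;
}

# Hypothetical helper: pull name, type and data out of one logical record.
# Returns a hash reference, or nothing if the record does not match.
sub parse_record {
    my ($record) = @_;
    return unless $record =~ /
        ^ (\S+)? \s+        # optional owner name
        (?: \d+ \s+ )?      # optional TTL
        (?: IN \s+ )?       # optional class
        ([a-zA-Z]+) \s+     # record type
        (.*\S)              # record data
    /x;
    my %rec = (name => $1, type => $2, data => $3);
    return \%rec;
}
```

Each input line is visited once and the per-record regex has no nested
quantifiers, so the run time should stay roughly proportional to the size of
the file. Usage would be something like
"for my $r (map { parse_record($_) } split_records($stuff)) { ... }".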

Anybody know anything about this?

 - David Harris
   Principal Engineer, DRH Internet Services


-----Original Message-----
From:	owner-baltimore-pm-list at happyfunball.pm.org
[mailto:owner-baltimore-pm-list at happyfunball.pm.org] On Behalf Of Craig Freter
Sent:	Thursday, August 26, 1999 3:19 PM
To:	James W. Sandoz; (BIO;FAC)
Cc:	baltimore-pm-list at happyfunball.pm.org
Subject:	Re: Script Question

James,

I modified your student parser script.  I used a different approach, in
that I try to match the entire student line with a single regular
expression.  I don't know if that makes the script more 'perlish', but
you might find my approach interesting.
