Script Question

Craig Freter cfreter at digex.net
Thu Aug 26 15:40:35 CDT 1999


David,

Because a DNS zone file contains different kinds of records (e.g. A,
CNAME, MX, NS, SOA), you may want to break up the regular expression.  A
separate regular expression for each record type would eliminate much of
the backtracking, since you are now matching more specific data.
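
Here is a minimal sketch of what I mean, not a complete zone parser. It
assumes one record per line with an explicit owner name, and it does not
handle parenthesized multi-line records such as SOA. The names
parse_record and %rdata_re are made up for illustration. One small
regex splits off the common fields, then a short pattern for the
record type checks the rest of the line.

```perl
use strict;
use warnings;

# Hypothetical per-type patterns for the data part of a record.
# Each pattern only has to describe one record type.
my %rdata_re = (
    A     => qr/^(\d{1,3}(?:\.\d{1,3}){3})$/,
    CNAME => qr/^(\S+)$/,
    NS    => qr/^(\S+)$/,
    MX    => qr/^(\d+)\s+(\S+)$/,
);

# Hypothetical helper: parse one line of a zone file.
# Returns a hash ref, or nothing if the line is blank, a comment,
# or a record type this sketch does not know about.
sub parse_record {
    my ($line) = @_;

    $line =~ s/;.*$//;     # drop a trailing comment
    $line =~ s/\s+$//;     # drop trailing whitespace
    return unless length $line;

    # Common prefix: owner name, optional TTL, optional class, record type.
    my ($name, $type, $rdata) =
        $line =~ /^(\S+)\s+(?:\d+\s+)?(?:IN\s+)?([A-Z]+)\s+(.*)$/
        or return;

    # Pick the pattern for this record type and apply it to the data.
    my $re = $rdata_re{$type} or return;
    my @fields = $rdata =~ $re or return;

    return { name => $name, type => $type, fields => \@fields };
}
```

You would call parse_record once per line and skip whatever it
rejects. Since every pattern is anchored and has no nested
quantifiers, a line that does not match fails quickly.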

My 2 cents worth.

> I've done this a lot in perl.. using a single large regular expression.
> However, sometimes I think that I am pushing it because more than one time I
> have created a regular expression that does so much backtracking that it chews
> up 20 seconds of CPU before I kill it.
> 
> For example, I got into lots of trouble with this regular expression, which was
> designed to parse zone files. You read the whole zone file into $stuff and loop
> grabbing records off the top with this regex and removing blank lines and
> comments with others.
> 
>                        $stuff =~
>                        s/
>                                ^
>                                \s*
>                                (?:(\S+)\s+)?
>                                (?:\d+\s+)?
>                                (?:IN\s+)?
>                                (?:([a-zA-Z]+)\s+)
>                                (
>                                (?:
>                                        [^\(\n]+ (?: \; .* )?
>                                        |
>                                        \(
>                                        (?:
>                                                [^\)\n\;]*
>                                                (?: \; .* )? \n
>                                                |
>                                                [^\)\n\;]+
>                                        )*?
>                                        \)
>                                )+?
>                                )
>                                \n
>                        //x
> 
> Anybody know anything about this?
> 
>  - David Harris
>    Principal Engineer, DRH Internet Services
> 
> -----Original Message-----
> From:   owner-baltimore-pm-list at happyfunball.pm.org
> [mailto:owner-baltimore-pm-list at happyfunball.pm.org] On Behalf Of Craig Freter
> Sent:   Thursday, August 26, 1999 3:19 PM
> To:     James W. Sandoz; (BIO;FAC)
> Cc:     baltimore-pm-list at happyfunball.pm.org
> Subject:        Re: Script Question
> 
>  << File: test2.pl >> James,
> 
> I modified your student parser script.  I used a different approach, in
> that I try to match the entire student line with a single regular
> expression.  I don't know if that makes the script more 'perlish', but
> you might find my approach interesting.

-- 
All that is complex is not useful,
and all that is useful is simple.
                       -- Mikhail Kalashnikov


