Script Question

David Harris dharris at drh.net
Thu Aug 26 15:53:41 CDT 1999


Craig Freter wrote:
> David,
>
> Because a DNS zone file contains different kinds of records (e.g. A,
> CNAME, MX, NS, SOA), you may want to break up the regular expression.  A
> separate regular expression for each record type would eliminate much of
> the backtracking, since you are now matching more specific data.
>
> My 2 cents worth.

I don't think splitting the regex into a separate one for each kind of record
would help, because each record is still allowed to specify or omit the name,
the address class, and the time to live. Each record is also allowed to use the
( .. ) syntax to span newlines. This causes all the backtracking.

I guess replacing "([a-zA-Z]+)" with something like "(a|cname|mx|ns|soa)" could
help reduce the backtracking.
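
To make that substitution concrete, here is a minimal sketch. It is written in
Python rather than Perl, and the record pattern is a simplified stand-in, not
the regex from this thread. The names build_record_re, GENERIC_TYPE, and
SPECIFIC_TYPE are made up for illustration.

```python
import re

# Hypothetical, simplified record pattern (NOT the regex discussed in the
# thread): optional name, optional TTL, optional class, record type, data.
GENERIC_TYPE = r"([a-zA-Z]+)"
SPECIFIC_TYPE = r"(a|cname|mx|ns|soa)"


def build_record_re(type_pattern):
    """Compile the simplified record regex around the given type sub-pattern."""
    return re.compile(
        r"^(\S+)?"          # optional owner name, starting in column 0
        r"(?:\s+(\d+))?"    # optional time to live
        r"(?:\s+(in))?"     # optional address class
        r"\s+" + type_pattern +
        r"\s+(.*)$",        # record data: the rest of the line
        re.IGNORECASE,
    )


generic_re = build_record_re(GENERIC_TYPE)
specific_re = build_record_re(SPECIFIC_TYPE)
```

With the explicit alternation, a token that is not one of the listed types
cannot be taken as the record type, so the engine has fewer ways to retry a
line. The cost is that record types outside the list (TXT, for example) no
longer match at all.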

My solution was just to get rid of all the junk that dealt with the ( .. ) line
continuation. If a record had multiline data, I got the first line of data and
the rest of the lines were not parsed. I didn't care about the actual data, so
this worked for me.
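
A rough sketch of that first-line-only approach, in Python rather than Perl.
The function name first_lines and the sample zone text are made up, and the
naive ";" comment stripping is an added assumption, not something described
above.

```python
def first_lines(zone_text):
    """Yield the first physical line of each record.

    Lines that fall inside a ( .. ) group are skipped rather than parsed.
    """
    depth = 0  # how many "(" are currently open
    for raw in zone_text.splitlines():
        # Assumption: ";" starts a comment (this ignores quoted ";").
        line = raw.split(";", 1)[0].rstrip()
        opens, closes = line.count("("), line.count(")")
        if depth == 0 and line.strip():
            # First line of a record: keep it, minus any parentheses.
            yield line.replace("(", "").replace(")", "").rstrip()
        depth = max(depth + opens - closes, 0)


# Made-up sample data for illustration only.
SAMPLE = """\
example.com. IN SOA ns1.example.com. admin.example.com. (
        2024010101 ; serial
        3600       ; refresh
        )
www IN A 10.0.0.1
"""

records = list(first_lines(SAMPLE))
```

Each line that comes out is a single physical line, so whatever regex is
applied afterwards never needs to match across newlines.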

However, I'm more interested in finding out why the regex caused such huge
amounts of backtracking.

 - David Harris
   Principal Engineer, DRH Internet Services





More information about the Baltimore-pm mailing list