[Wellington-pm] Today's tricky perl teaser

Jacinta Richardson jarich at perltraining.com.au
Mon Nov 27 23:30:27 PST 2006


Cliff Pratt wrote:
> I now know why, but can anyone spot why the regex will not compile on 
> the date/time line? (Please ignore the inappropriate line breaks).

/x is special, but when Perl sees a regular expression it first looks for the
end of the expression before it starts interpreting comments (because it has to
get to the end in order to discover you're using /x mode).

Your date time line says:

    \s+(\[.*\])             # Date / time

                                   ^

Perl sees the slash in your comment and decides that that's the end of the
regular expression.  Since it doesn't make sense to divide the result of your
regular expression by the call to time() (which gets some very strange
arguments), Perl realises something has gone wrong and complains.

Change your regular expression to use m{ } instead and this will all get better:

   my @fields = m{
          (^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) # Host IP address
          ...
   }x;
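
A small sketch of the difference, assuming a made-up sample line and a cut-down
two-field pattern (neither is from Cliff's original code).  The slash-delimited
form is kept in a string and run through string eval, so that its compile
failure can be caught rather than stopping the whole script:

```perl
use strict;
use warnings;

# Hypothetical sample text, not taken from the original thread.
my $line = 'host [27/Nov/2006:23:30:27 -0800]';

# The /.../x form, stored as source text so that this script itself compiles.
# Perl ends the pattern at the slash inside the "Date / time" comment.
my $broken = <<'END_CODE';
my @f = $line =~ /
    (\S+)          # Host
    \s+(\[.*\])    # Date / time
/x;
1;
END_CODE

# String eval returns undef on a compile error and puts the message in $@.
my $ok = eval $broken;
print "slash-delimited: ", ($ok ? "compiled" : "failed to compile"), "\n";

# The same pattern with m{ }x: a slash in a comment is now harmless.
my @fields = $line =~ m{
    (\S+)          # Host
    \s+(\[.*\])    # Date / time
}x;
print "brace-delimited: @fields\n";
```

Note that the brace form has its own version of the same trap: an unbalanced
{ or } inside a comment would confuse Perl in just the same way.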

There are a number of ways you can simplify this regular expression and make it
more efficient.  I've shown an alternative below:

	my @fields = m{
		(\S+)		# Host IP (stuff which isn't a space char)
		\s+
		(\S+)		# Remote logname
		\s+
		(\S+)		# Remote user
		\s+
		\[		# start of date
		(.{26})		# Date / time (fixed length field)
		\]		# Alternately: [^\]]+ stuff which isn't ]
		\s+
		("[^"]*")	# Request
		\s+
		(\d{3})		# Status code
		\s+
		(\d+|-)		# Bytes transferred
		\s+
		("[^"]*")	# Referrer (from request header)
		\s+
		("[^"]*")	# User-Agent (from request header)
		\s+
		(.+)		# Host (from request header)
	}x;

(un-tested, but probably close).
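
One way to check a pattern of this shape is to run it against a single sample
line and print what it captures.  The sketch below assumes a hypothetical log
line (the real data isn't shown in this thread) in which a host name follows
the User-Agent.  It uses \S+ for the space-delimited fields, [^"]* inside the
quoted ones, and the [^\]]+ alternative for the date:

```perl
use strict;
use warnings;

# Hypothetical log line, invented for illustration.
my $line = '192.168.0.1 - - [27/Nov/2006:23:30:27 -0800] '
         . '"GET /index.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0" www.example.com';

my @fields = $line =~ m{
    ^(\S+)       \s+     # Host IP
    (\S+)        \s+     # Remote logname
    (\S+)        \s+     # Remote user
    \[ ([^\]]+) \] \s+   # Date and time, without the brackets
    ("[^"]*")    \s+     # Request
    (\d{3})      \s+     # Status code
    (\d+|-)      \s+     # Bytes transferred
    ("[^"]*")    \s+     # Referrer
    ("[^"]*")    \s+     # User-Agent
    (.+)                 # Host
}x;

if (@fields) {
    # One numbered line per captured field.
    printf "%2d: %s\n", $_ + 1, $fields[$_] for 0 .. $#fields;
}
else {
    print "no match\n";
}
```

If the line doesn't match, @fields is empty, so it's worth testing for that
before using any of the values.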

The main advantage of this approach is that it needs little or no backtracking.
It also avoids all the messing around that .*? requires, which is kind of
like a car trip with kids:

try to match next pattern                           (when will we get there?)
match current character, try to match next pattern, (are we there yet?)
match current character, try to match next pattern, (are we there yet?)
match current character, try to match next pattern, (are we there yet?)
match current character, try to match next pattern, (are we there yet?)
...
match next pattern.
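
The same creeping-forward behaviour can change what gets matched, not just how
long it takes.  A small sketch, assuming a made-up string in which the first
quoted field is not followed by a three-digit code: the .*? version keeps
inching along until it has stepped right over a quote, while the [^"]* version
simply gives up.

```perl
use strict;
use warnings;

# Hypothetical input: the first quoted field is followed by junk, not a code.
my $text = '"a" junk "b" 200';

# Lazy match: grows one character at a time until the rest of the pattern fits.
my ($lazy)  = $text =~ m{ ^ " (.*?)   " \s+ (\d{3}) }x;

# Character class: can never cross a quote, so this anchored match fails.
my ($class) = $text =~ m{ ^ " ([^"]*) " \s+ (\d{3}) }x;

print "lazy:  ", (defined $lazy  ? $lazy  : "no match"), "\n";
print "class: ", (defined $class ? $class : "no match"), "\n";
```

With this input the lazy capture ends up holding a quote character, which is
rarely what you want from a "quoted field" pattern.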

All the best,

	J

-- 
   ("`-''-/").___..--''"`-._          |  Jacinta Richardson         |
    `6_ 6  )   `-.  (     ).`-.__.`)  |  Perl Training Australia    |
    (_Y_.)'  ._   )  `._ `. ``-..-'   |      +61 3 9354 6001        |
  _..`--'_..-_/  /--'_.' ,'           | contact at perltraining.com.au |
 (il),-''  (li),'  ((!.-'             |   www.perltraining.com.au   |

