[Wellington-pm] Larry on Regexes

Sam Vilain sam at vilain.net
Thu Oct 14 16:02:57 CDT 2004


Of course, there's always (?s:.) to enable it inside the regex.
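For anyone who hasn't used the inline form, here is a minimal sketch of the difference between a bare dot, the /s flag, and (?s:.). The variable names and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text with one embedded newline.
my $text = "line one\nline two";

# Without /s, '.' will not match the newline, so this anchored match fails.
my $plain  = $text =~ /\Aline one.line two\z/      ? 1 : 0;

# The /s flag lets every '.' in the pattern match a newline.
my $flag   = $text =~ /\Aline one.line two\z/s     ? 1 : 0;

# (?s:.) turns the same behaviour on for just this one dot.
my $inline = $text =~ /\Aline one(?s:.)line two\z/ ? 1 : 0;

print "plain=$plain flag=$flag inline=$inline\n";   # plain=0 flag=1 inline=1
```

The inline form is handy when only part of a larger pattern should be allowed to cross line boundaries.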

Here are the timing comparisons on a 44-character string; there's not much in it:

Benchmark: timing 10000000 iterations of m/^(?s:.)*$/, m/^.*$/s, m/^[\0-\377]*$/...
   m/^(?s:.)*$/:  4 wallclock secs ( 5.35 usr + -0.02 sys =  5.33 CPU) @ 1876172.61/s (n=10000000)
       m/^.*$/s:  6 wallclock secs ( 6.06 usr +  0.01 sys =  6.07 CPU) @ 1647446.46/s (n=10000000)
m/^[\0-\377]*$/:  9 wallclock secs ( 8.62 usr +  0.00 sys =  8.62 CPU) @ 1160092.81/s (n=10000000)
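The script behind those numbers wasn't posted, so the following is only a sketch of how a comparison like this can be set up with the core Benchmark module. The test string (44 copies of "x") and the reduced iteration count are assumptions; only the three patterns come from the output above.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Assumption: any 44-character string; the original test string is unknown.
my $string = 'x' x 44;

# Assumption: a small count for a quick run. The post used 10_000_000.
my $count = 100_000;

# timethese() runs each sub $count times, prints a report, and returns
# a hash reference of Benchmark objects keyed by label.
my $results = timethese($count, {
    'm/^(?s:.)*$/'    => sub { $string =~ m/^(?s:.)*$/ },
    'm/^.*$/s'        => sub { $string =~ m/^.*$/s },
    'm/^[\0-\377]*$/' => sub { $string =~ m/^[\0-\377]*$/ },
});
```

Absolute figures will differ by machine and Perl version, so only the relative ordering is worth comparing.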

Who can tell me what this one does, then?  :-)

$paren = qr{\( (?: (?> [^()]+ )
	    |   (??{ $paren })
	   )* \)}x;

$_ = $input;

while ( s{^\(((?:[^(]+|$paren)*)\s+and\s((?:[^(]+|$paren)*)\)$}{$1}is
	    or s{^((?:[^(]+|$paren)*)\s+and\s((?:[^(]+|$paren)*)$}{$1}is
	) {
		push @x, $2;
}

push @x, $_;
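As a hint for the first half only: a small sketch of what $paren accepts on its own. The pattern is copied from above; the test strings and the %balanced hash are made up for illustration, and the substitution loop is left as the puzzle.

```perl
use strict;
use warnings;

# Declare first so the (??{ $paren }) block can refer to the variable.
my $paren;
$paren = qr{\( (?: (?> [^()]+ )      # a run of non-parens, no backtracking
            |   (??{ $paren })       # or a nested parenthesised group
           )* \)}x;

# Hypothetical inputs: two balanced, two not.
my %balanced;
for my $s ('(a)', '(a (b c) d)', '((a)', '(a))') {
    $balanced{$s} = $s =~ /\A$paren\z/ ? 1 : 0;
    print "$s: ", ($balanced{$s} ? "balanced" : "not balanced"), "\n";
}
```

The (??{ ... }) construct re-evaluates $paren at match time, which is what lets the pattern refer to itself.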

Philip Abrahamson wrote:
> If you use the 's' flag following the pattern match it will let '.' 
> match a newline - as well as everything else.
> 
> Philip Abrahamson
> 
> Dave Moskovitz wrote:
> 
>> That would have been in the days before Unicode!
>>
>> On Wed, 13 Oct 2004 21:47, Grant McLean wrote:
>>
>>> Because . doesn't match \n.  [\0-\377] is the most efficient way
>>> to match everything currently.  Maybe \e should match everything.
>>> And \E would of course match nothing.  :-)
>>>            --Larry Wall in <9847 at jpl-devvax.JPL.NASA.GOV>
>>>
>>


-- 
Sam Vilain, sam /\T vilain |><>T net, PGP key ID: 0x05B52F13
(include my PGP key ID in personal replies to avoid spam filtering)

