[Melbourne-pm] An old one but an important one

Tim Connors tconnors at astro.swin.edu.au
Mon Jan 21 00:29:14 PST 2008


On Mon, 21 Jan 2008, Jacinta Richardson wrote:

> Tim Connors wrote:
> 
> > But if you can't determine where $& is going to come from, why can you 
> > determine where $1 is going to come from (given that $& is essentially 
> > capture group 0), such that you don't need to slow down all other regexps too?
> 
> Perl can't determine where $1 and friends are coming from at any given point in
> your program (without running it).  Thus any regular expression which uses
> capturing parentheses sets $1 and friends, even if Perl can't find you using
> $1 anywhere, and whether or not you capture into variables.  Of course, this
> also happens when you're merely using the parentheses for grouping, because
> Perl doesn't know any better.
> 
> For example consider this code:
> 
> 	$_ = "happy happy joy joy";
> 	if(/(h.*y) joy/) {
> 		if(/(joy)/) {
> 		}
> 		print "Matched: $1\n";   # Which $1 is this?
> 	}
> 
> What if we wrote:
> 
> 	$_ = "happy happy joy joy";
> 	if(/(h.*y) joy/) {
> 		if(/(pig)/) {
> 		}
> 		print "Matched: $1\n";   # Which $1 is this?
> 	}
> 
> 
> Perl knows that it has to record $1 for each expression that it successfully
> evaluates, because we're using capturing parentheses.  A failed match does not
> reset $1, so it keeps the value from the last successful match in scope (on my
> Perl 5.8.8 the second snippet of code prints "Matched: happy happy joy").
> 
> Perl doesn't care which expression actually set $1.  If we'd used $& and/or its
> friends it'd be the same story.  Perl would set it for each expression evaluated
> because it cannot tell which expression you needed it from.
> 
> Perl can easily tell whether your regular expression is matching and capturing,
> or just matching.  Thus using non-capturing parentheses instead of capturing
> ones will speed up those expressions, while capturing in one expression will
> not slow down any others.

Sorry - what I meant to ask was why $& slows down *all* regexps used in the 
program, whether they use parentheses or not, while $1 doesn't.  With $1, the 
only regexps slowed down are those that use capturing parentheses.
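
To make the capturing versus grouping distinction concrete, here is a 
minimal sketch (first_capture is just a made-up helper name, not anything 
that comes with Perl).  It only shows which matches set $1; it says nothing 
about timing:

```perl
use strict;
use warnings;

# Hypothetical helper: run one match and report what ended up in $1.
sub first_capture {
    my ($text, $re) = @_;
    return unless $text =~ $re;      # no match: return nothing
    my $captured = $1;               # copy $1 straight after the match
    return defined $captured ? $captured : '(none)';
}

# (joy) is a capturing group, so $1 is filled in.
print first_capture("happy happy joy joy", qr/(joy)/), "\n";

# (?:joy) only groups, so there is no $1 for Perl to record.
print first_capture("happy happy joy joy", qr/(?:joy)/), "\n";
```

The first call prints "joy" and the second prints "(none)".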

I guess it's because $& applies to every successful match, since it doesn't 
need capturing parentheses.  So if it is referred to anywhere in the program, 
it has to be created for every regexp.
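
A quick sketch of that guess, assuming nothing beyond core Perl (match_vars 
is a hypothetical helper): a pattern with no parentheses at all still fills 
in $&, while $1 stays undefined.

```perl
use strict;
use warnings;

# Hypothetical helper: match a pattern that has no parentheses and
# return copies of $& and $1 as they stand right after the match.
sub match_vars {
    my ($text) = @_;
    if ($text =~ /joy/) {
        my ($whole, $first) = ($&, $1);   # copy before anything else matches
        return ($whole, $first);
    }
    return;
}

my ($whole, $first) = match_vars("happy happy joy joy");
print "whole match: $whole\n";
print "\$1 is ", (defined $first ? "'$first'" : "undef"), "\n";
```

This prints "whole match: joy" followed by "$1 is undef".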

-- 
Tim Connors


