[Melbourne-pm] An old one but an important one

Tim Connors tconnors at astro.swin.edu.au
Mon Jan 21 00:29:14 PST 2008

On Mon, 21 Jan 2008, Jacinta Richardson wrote:

> Tim Connors wrote:
> > But if you can't determine where $& is going to come from, why can you 
> > determine where $1 is going to come from (given that $& is essentially 
> > $0), such that you don't need to slow down all other regexps too?
> Perl can't determine where $1 and friends are coming from at any given point in
> your program (without running it).  Thus any regular expression which uses
> capturing parentheses sets $1 and friends (and does so even if Perl can't find
> you using $1 etc. anywhere, and whether or not you capture into variables).
> Of course this also happens when you're merely using the parentheses for
> grouping, because Perl doesn't know any better.
> For example, consider this code:
>     $_ = "happy happy joy joy";
>     if (/(h.*y) joy/) {
>         if (/(joy)/) {
>         }
>         print "Matched: $1\n";   # Which $1 is this?
>     }
> What if we wrote:
>     $_ = "happy happy joy joy";
>     if (/(h.*y) joy/) {
>         if (/(pig)/) {
>         }
>         print "Matched: $1\n";   # Which $1 is this?
>     }
> Perl knows that it has to record $1 for each expression that it successfully
> evaluated, because we're using capturing parentheses.  It will try to clear $1
> on an unsuccessful evaluation, but you shouldn't count on it (on my Perl 5.8.8
> the second snippet of code prints "Matched: happy happy joy").
> Perl doesn't care which expression actually set $1.  If we'd used $& and/or its
> friends it'd be the same story.  Perl would set it for each expression evaluated
> because it cannot tell which expression you needed it from.
> Perl can easily tell whether your regular expression is matching and capturing,
> or just matching.  Thus using non-capturing parentheses instead of capturing
> ones will speed up those expressions, while keeping capturing parentheses in
> one expression will not slow down any others.
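
To make the grouping point quoted above concrete, here is a minimal sketch. 
The helper name group_count is made up for illustration; it only relies on 
$#+, which perlvar documents as the number of groups in the last successful 
match. It shows that (?:...) groups without recording a capture, while 
plain (...) records one even if nobody reads $1:

```perl
use strict;
use warnings;

# Hypothetical helper: returns how many capture groups the last successful
# match recorded, or undef if the pattern did not match.
sub group_count {
    my ($text, $re) = @_;
    # $#+ is the index of the last group in the most recent successful match.
    return $text =~ $re ? $#+ : undef;
}

my $text = "happy happy joy joy";
print "capturing:     ", group_count($text, qr/(joy|pig) joy/), "\n";
print "non-capturing: ", group_count($text, qr/(?:joy|pig) joy/), "\n";
```

This only demonstrates what gets recorded, not how much time is saved; any 
speed difference would need to be measured separately (for example with 
the Benchmark module).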

Sorry - I meant: why does $& slow down *all* regexps used in the program, 
whether they use parentheses or not, when $1 doesn't cause all regexps to 
slow down?  With $1, the only regexps slowed down are those that use 
capturing parentheses.

I guess it's because $& has to be set on every successful match, since it 
doesn't depend on capturing parentheses.  So if it is referred to anywhere 
in the program, it must be created for every regexp.
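
If that guess is right, the way around it is to get at the whole match 
without ever mentioning $&. A rough sketch of two ways to do that follows; 
the sub names are hypothetical, and the second one assumes Perl 5.10 or 
later, where the /p modifier and ${^MATCH} are available:

```perl
use strict;
use warnings;

# Hypothetical helper: whole-match text via the offset arrays @- and @+,
# so $& never appears in the program.
sub whole_match_by_offsets {
    my ($text, $re) = @_;
    return unless $text =~ $re;
    # $-[0] and $+[0] are the start and end offsets of the entire match.
    return substr($text, $-[0], $+[0] - $-[0]);
}

# Hypothetical helper: whole-match text via /p and ${^MATCH}
# (assumes Perl 5.10 or later).
sub whole_match_by_p_flag {
    my ($text) = @_;
    return unless $text =~ /h\w+ joy/p;
    return ${^MATCH};
}

my $text = "happy happy joy joy";
print whole_match_by_offsets($text, qr/h\w+ joy/), "\n";
print whole_match_by_p_flag($text), "\n";
```

Neither pattern uses parentheses, yet the matched text is still available, 
and only the regexps that ask for it are involved.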

Tim Connors
