[Melbourne-pm] An old one but an important one
Tim Connors
tconnors at astro.swin.edu.au
Mon Jan 21 00:29:14 PST 2008
On Mon, 21 Jan 2008, Jacinta Richardson wrote:
> Tim Connors wrote:
>
> > But if you can't determine where $& is going to come from, why can you
> > determine where $1 is going to come from (given that $& is essentially
> > $0), such that you don't need to slow down all other regexps too?
>
> Perl can't determine where $1 and friends are coming from at any given point in
> your program (without running it). Thus any regular expression which uses
> capturing parentheses sets $1 and friends (and does so even if Perl can't find
> you using $1 etc anywhere, and whether or not you capture into variables).
> Of course this also happens when you're merely using the parentheses for
> grouping, because Perl doesn't know any better.
>
> For example consider this code:
>
> $_ = "happy happy joy joy";
> if(/(h.*y) joy/) {
>     if(/(joy)/) {
>     }
>     print "Matched: $1\n"; # Which $1 is this?
> }
>
> What if we wrote:
>
> $_ = "happy happy joy joy";
> if(/(h.*y) joy/) {
>     if(/(pig)/) {
>     }
>     print "Matched: $1\n"; # Which $1 is this?
> }
>
>
> Perl knows that it has to record $1 for each expression that it successfully
> evaluated, because we're using capturing parentheses. It will try to clear $1
> on an unsuccessful evaluation, but you shouldn't count on it (on my Perl 5.8.8
> the second snippet of code prints "Matched: happy happy joy").
>
> Perl doesn't care which expression actually set $1. If we'd used $& and/or its
> friends it'd be the same story. Perl would set it for each expression evaluated
> because it cannot tell which expression you needed it from.
>
> Perl can easily tell whether your regular expression is matching and capturing,
> or just matching. Thus using non-capturing parentheses instead of capturing
> will speed up those expressions, while not using them will not slow down any others.
Sorry, what I meant was: why does $& slow down *all* regexps used in the
program, whether they use parentheses or not, when $1 doesn't cause all
regexps to slow down? With $1, the only regexps slowed down are those that
use capturing parentheses.
I guess it's because $& has to be set for every successful match, since it
doesn't need capturing parentheses to be meaningful. So if it is referred to
anywhere in the program, it must be created for every regexp.
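A minimal sketch of that difference, not taken from the thread. The helper names whole_match and first_capture are made up for illustration. It only shows which variables get set, not the timing cost:

```perl
use strict;
use warnings;

# Hypothetical helper: returns $& (the whole matched text), or undef
# if the regexp did not match.
sub whole_match {
    my ($text, $re) = @_;
    return $text =~ $re ? $& : undef;
}

# Hypothetical helper: returns $1 from the same kind of match, or undef
# if the regexp failed or has no capture group.
sub first_capture {
    my ($text, $re) = @_;
    return $text =~ $re ? $1 : undef;
}

my $text = "happy happy joy joy";

# No parentheses at all: $& is still set, $1 is not.
print "whole: ", whole_match($text, qr/joy/), "\n";
print "\$1 without parens: ",
    (defined first_capture($text, qr/joy/) ? "defined" : "undef"), "\n";

# Capturing parentheses: this regexp has to record $1.
print "captured: ", first_capture($text, qr/(h.*y) joy/), "\n";

# Non-capturing (?:...) groups without recording $1.
print "\$1 with (?:...): ",
    (defined first_capture($text, qr/(?:h.*y) joy/) ? "defined" : "undef"), "\n";
```

Since any successful match can be followed by a read of $&, merely mentioning it (as whole_match does) is what triggers the program-wide cost, while $1 only makes sense for regexps that contain capturing parentheses.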
--
Tim Connors