[Pdx-pm] regular expression

Tkil tkil at scrye.com
Sat Mar 13 23:09:29 CST 2004


>>>>> "John" == John Sechrest <sechrest at peak.org> writes:

John> So regular expressions are greedy. So you have to provide more
John> constraints.

Note that Perl has exactly what is needed to turn a greedy quantifier
into a non-greedy one: suffix the quantifier with a '?' character.

In this case:

| $ cat in.txt
| keyword NeedThis_1(blah, bloh); keyword NeedThis_2(blah, bloh,foo, bar);

Original expression:

| $ perl -lnwe 'while ( /keyword (.*)\(/g ) { print $1 }' in.txt
| NeedThis_1(blah, bloh); keyword NeedThis_2

Non-greedy version (note ".*?" instead of ".*"):

| $ perl -lnwe 'while ( /keyword (.*?)\(/g ) { print $1 }' in.txt
| NeedThis_1
| NeedThis_2

John> You could say:
John> keyword [0-9_A-Za-z]*\(
John> And limit it to those characters

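This is a pretty good suggestion anyway; most tokens are going to be
alphanumeric plus underscore, which (for plain ASCII input, at least)
is exactly what "\w" matches. Since "\w+" stops at the first non-word
character on its own, the trailing "\(" isn't needed here: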

| $ perl -lnwe 'while ( /keyword (\w+)/g ) { print $1 }' in.txt
| NeedThis_1
| NeedThis_2

t.


