[sf-perl] RE oddity
Joseph Brenner
doom at kzsu.stanford.edu
Fri Feb 17 12:39:06 PST 2006
Rich Morin <rdm at cfcl.com> wrote:
> I'm a big fan of extended regular expressions, but I just
> wrote one that didn't work as I expected. This code:
>
> $line =~ s|[\000-\010 # nul-bs
> \012-\037 # nl-us
> \177-\377'] # del-... and '
> ||gx; # Punt weird characters.
>
> produced the nastygram:
>
> Invalid [] range "l-b" in regex;
> marked by <-- HERE in m/[\000-\010 # nul-b <-- HERE s
> \012-\037 # nl-us
> \177-\377'] # del-...
> / at /home/rdm/bin/log_load.pl line 206.
>
> but this code:
>
> $line =~ s|[\000-\010\012-\037\177-\377']||g;
>
> sails right through. Is this a bug or a (mis-)feature?
man perlre:
The "/x" modifier itself needs a little more explanation. It tells the
regular expression parser to ignore whitespace that is neither
backslashed nor within a character class.
It's documented. It's a feature.
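Inside the brackets, /x leaves whitespace and "#" alone, so the text of
"# nul-bs" becomes part of the class and "l-b" gets read as a range. If you
still want a comment per range, one possible workaround is to build the class
from commented pieces in ordinary Perl and interpolate it. This is only a
sketch: @ranges, $class and $weird are names I made up, and the sample $line
is invented for illustration.

```perl
use strict;
use warnings;

# Document each range with ordinary Perl comments, outside the
# character class, where /x cannot help.
my @ranges = (
    '\000-\010',    # nul-bs
    '\012-\037',    # nl-us
    '\177-\377',    # del-...
    q{'},           # and '
);
my $class = join '', @ranges;
my $weird = qr/[$class]/;    # no /x needed here

my $line = "it's\ta \x01test\x7f\n";
$line =~ s/$weird//g;        # Punt weird characters.
```

The tab (octal 011) falls between the first two ranges, so it survives; the
quote, the control characters and the newline are removed.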
The gotcha I usually get stung on is assuming that /x does something to
the right hand side of a s///x:
s{ ^ (.*?) # capture first word to $1
\s # separated by a space
(.*?) $ # capture second word to $2
}{$1 $2}x
That'll remove a space from between two items and then put it right back
again, because /x only applies to the pattern and the whitespace in the
replacement is literal. (Of course, if you were trying to convert tabs to
spaces, then this could be useful.)
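A quick way to see it, as a sketch with made-up variable names ($text,
$spaced, $padded) and a made-up tab-separated input:

```perl
use strict;
use warnings;

my $text = "foo\tbar";

# /x applies only to the pattern. The replacement is an ordinary
# double-quoted string, so the space in "$1 $2" is kept.
(my $spaced = $text) =~ s{ ^ (.*?) \s (.*?) $ }{$1 $2}x;

# Whitespace around the variables on the right-hand side is kept too.
(my $padded = $text) =~ s{ ^ (.*?) \s (.*?) $ }{ $1 $2 }x;

print "[$spaced]\n";
print "[$padded]\n";
```

The first substitution turns the tab into a single space; the second also
picks up a leading and a trailing space from the replacement.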