[Melbourne-pm] basic regex - char array vs "pipe brackets"
wigs at stirfried.org
Wed Mar 4 21:02:34 PST 2009
On Thu, Mar 05, 2009 at 03:10:47PM +1100, Christopher Short wrote:
> I saw this regex in someone else's code (to check validity of an id field)
> /^(\w|\@|\.|\-)+$/
>
> and decided that it wouldn't work properly, that what they'd meant was
> the character array
> /^[\w\@\.\-]+$/
>
> Luckily I chucked it into Regex Coach and found both of them worked
> just as well.
> Thing is, when I look at
> /^(\w|\@|\.|\-)+$/
The '+' here means one or more repetitions of the preceding group, not
of whatever that group happened to match, so each repetition is free to
take a different alternative.
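A minimal sketch of the above, assuming a few made-up id strings (they
are not from the original code). It runs both patterns over the same
input and shows that the capture group only keeps the character from
the last repetition:

```perl
use strict;
use warnings;

# The two patterns from the question.
my $alternation = qr/^(\w|\@|\.|\-)+$/;
my $char_class  = qr/^[\w\@\.\-]+$/;

# Hypothetical test ids, for illustration only.
for my $id ('chris.short@example-host.org', 'bad id', '') {
    my $alt_result   = $id =~ $alternation ? 'match' : 'no match';
    my $class_result = $id =~ $char_class  ? 'match' : 'no match';
    print "'$id': alternation=$alt_result, class=$class_result\n";
}

# The + repeats the group itself, so every repetition may pick a
# different branch. $1 holds only what the final repetition matched.
if ('a.b-c' =~ $alternation) {
    print "last captured: $1\n";
}
```

Writing the group as (?:...) avoids the capture if $1 is not needed.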
> Turns out the long-term perl developer who wrote that code always uses
> "pipe brackets" instead of character arrays.
> Do they really function identically?
The biggest difference between character arrays (usually called
character classes) and the parens grouping is that the character class
allows for a compact specification of ranges, eg:
[a-z] is the same as (?:a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z)
As such, the hyphen has a special meaning within a character class.
This is often a source of surprise to Perl developers when the regex
doesn't do what they expect.
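A small sketch of the hyphen surprise, using patterns invented for
illustration (they are not from the code under discussion). An
unescaped hyphen between two characters forms a range, while an escaped
hyphen, or one placed last in the class, is taken literally:

```perl
use strict;
use warnings;

# Unescaped hyphen between '+' (0x2B) and '.' (0x2E) forms a range,
# which also covers ',' (0x2C) and '-' (0x2D).
my $range   = qr/^[+-.]+$/;
# Escaped, or placed last in the class, the hyphen is literal.
my $literal = qr/^[+\-.]+$/;
my $at_end  = qr/^[+.-]+$/;

for my $char ('+', ',', '-', '.') {
    printf "'%s' range=%d literal=%d at_end=%d\n",
        $char,
        ($char =~ $range   ? 1 : 0),
        ($char =~ $literal ? 1 : 0),
        ($char =~ $at_end  ? 1 : 0);
}

# [a-z] and the spelled-out alternation accept the same single characters.
my $spelled_out = join '|', 'a' .. 'z';
my $long_form   = qr/^(?:$spelled_out)$/;
my $short_form  = qr/^[a-z]$/;
for my $char ('a', 'm', 'z', 'A', '-') {
    my $agree = (($char =~ $long_form) ? 1 : 0) == (($char =~ $short_form) ? 1 : 0);
    print "'$char': ", ($agree ? 'same result' : 'different result'), "\n";
}
```

The comma is the telling case here: only the range form accepts it.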
Regards,
--
Aaron