SPUG: When is a caret just a caret? And what about dollar?
Joshua ben Jore
twists at gmail.com
Sun Oct 4 11:18:59 PDT 2009
On Fri, Aug 21, 2009 at 11:47 PM, Michael R. Wolf <MichaelRWolf at att.net> wrote:
> I seem to remember that the meaning of caret in a regex is
> context-sensitive, and it can be an anchor, a complement, or just (with
> apologies to Sigmund) a caret.
>
> My memory is that caret is an anchor iff it's the first character in a regex
> and a complement iff it's the first character in a character class, else
> it's a self-match. Ergo, /^^[^^]^/ would match "beginning of line then
> caret then anything-but-a-caret then caret". I can't seem to find support
> for this (long-held) belief. Have I been wrong for this long?
\^ means the character '^'.
^ anywhere outside a character class, without /m on, means "match
the start of the string".
^ with /m means "match the start of the string OR the start of any
line after a newline".
^ as the first character in a character class negates the class;
anywhere else in the class it is just the character '^'.
^ is part of a variable name if the name is something like $^W
or ${^WARNING_BITS}.
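
A small sketch of the caret rules above, in case it helps to see them run. The test strings and variable names are made up for illustration; only core Perl is assumed.

```perl
use strict;
use warnings;

# \^ matches a literal caret anywhere in the string.
my $literal = "a^b" =~ /\^/ ? 1 : 0;

# Without /m, ^ matches only at the very start of the string,
# so only the first word is found.
my $no_m = () = "one\ntwo" =~ /^\w+/g;

# With /m, ^ also matches just after each embedded newline.
my $with_m = () = "one\ntwo" =~ /^\w+/mg;

# ^ first inside a character class negates the class, so [^^]
# means "any character except a caret" ...
my $negated = "^" =~ /[^^]/ ? 1 : 0;

# ... while ^ anywhere else in the class is an ordinary member.
my $member = "^" =~ /[a^]/ ? 1 : 0;

# The pattern from the question: the second ^ is still an anchor,
# so a literal caret in the subject does not help it match.
my $question = "^x^" =~ /^^[^^]^/ ? 1 : 0;

print "$literal $no_m $with_m $negated $member $question\n";
```

The `() = ...` in the middle is just the usual idiom for counting matches of a /g pattern.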
> And while I'm at it, how 'bout dollar? I thought it was an anchor iff it
> was the last character in the regex, else it introduced a scalar variable
> for interpolation.
\$ means the character '$'
Regexps using ' as the quote are non-interpolating, so $ never
introduces a variable there; it is still the end-of-string anchor.
$ interpolates if the name matches:
* normal names like $foo
* capture variables like $1 or $314159265
* names with leading control characters like $^X or ${^WARNING_BITS}
* punctuation like $$
* EXCEPT the variables $(, $|, $)
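
A rough sketch of the interpolation cases, assuming a hypothetical variable $foo that exists only for this example:

```perl
use strict;
use warnings;

my $foo = "bar";    # hypothetical variable, for illustration only

# $foo is interpolated, so the pattern that runs is /^bar$/.
# The trailing $ has no name after it and stays an anchor.
my $interp = "bar" =~ /^$foo$/ ? 1 : 0;

# \$ is a literal dollar sign, so nothing is interpolated here.
my $escaped = 'costs $foo' =~ /\$foo/ ? 1 : 0;

# $) is not interpolated in a pattern: the $ stays an anchor and
# the ) just closes the group.
my $group = "bar" =~ /(bar$)/ ? 1 : 0;

# Likewise $| : the $ is an anchor in front of the alternation bar.
my $alt = "bar" =~ /^bar$|^baz$/ ? 1 : 0;

# With ' as the delimiter nothing is interpolated, and the
# trailing $ acts as the end-of-string anchor.
my $single = "bar" =~ m'^bar$' ? 1 : 0;

print "$interp $escaped $group $alt $single\n";
```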
$ that hasn't interpolated matches:
* at the end of the string, or before a \n at the end of the string,
if /m wasn't on
* before every newline in the middle of the string too, if /m was on
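
The anchor behaviour of $ can be checked with a short sketch like this one. The strings are invented for the example.

```perl
use strict;
use warnings;

# Without /m, $ matches at the very end of the string ...
my $at_end = "one\ntwo" =~ /two$/ ? 1 : 0;

# ... or just before a newline that ends the string.
my $before_nl = "one\ntwo\n" =~ /two$/ ? 1 : 0;

# It does not match before a newline in the middle of the string.
my $middle = "one\ntwo" =~ /one$/ ? 1 : 0;

# With /m, $ also matches before each embedded newline.
my $middle_m = "one\ntwo" =~ /one$/m ? 1 : 0;

# Counting the words that sit at a line end, without and with /m.
my $count   = () = "one\ntwo\n" =~ /\w+$/g;
my $count_m = () = "one\ntwo\n" =~ /\w+$/mg;

print "$at_end $before_nl $middle $middle_m $count $count_m\n";
```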
What counts as a "normal" name depends on whether your source code
is ASCII, Unicode, or EBCDIC.
Josh