SPUG: When is a caret just a caret? And what about dollar?

Joshua ben Jore twists at gmail.com
Sun Oct 4 11:18:59 PDT 2009


On Fri, Aug 21, 2009 at 11:47 PM, Michael R. Wolf <MichaelRWolf at att.net> wrote:
> I seem to remember that the meaning of caret in a regex is
> context-sensitive, and it can be an anchor, a complement, or just (with
> apologies to Sigmund) a caret.
>
> My memory is that caret is an anchor iff it's the first character in a regex
> and a complement iff it's the first character in a character class, else
> it's a self-match.  Ergo, /^^[^^]^/ would match "beginning of line then
> caret then anything-but-a-caret then caret".  I can't seem to find support
> for this (long-held) belief.  Have I been wrong for this long?

\^ means the literal character '^'.
^ anywhere outside a character class, without /m on, means "match
the start of the string".
^ with /m on means "match the start of the string OR the start of any
line after an embedded newline".
^ as the first character in a character class negates the class;
anywhere else in the class it is a literal '^'.
^ is part of a variable name if the name is something like $^W
or ${^WARNING_BITS}.
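
A quick way to check these caret rules is to run them. This is only a
sketch: the helper name caret_demo and the test strings are made up for
illustration, and the last two cases compare the pattern from the
question with a version that has its carets escaped.

```perl
use strict;
use warnings;

# Hypothetical helper: runs each caret case and records 1 (matched) or 0.
sub caret_demo {
    my %r;
    $r{escaped}        = ("a^b"  =~ /a\^b/)      ? 1 : 0;  # \^ is a literal caret
    $r{mid_anchor}     = ("a^b"  =~ /a^b/)       ? 1 : 0;  # ^ is an anchor even mid-pattern
    $r{newline_no_m}   = ("a\nb" =~ /a\n^b/)     ? 1 : 0;  # no /m: start of string only
    $r{newline_with_m} = ("a\nb" =~ /a\n^b/m)    ? 1 : 0;  # /m: start of every line
    $r{class_negated}  = ("^"    =~ /[^^]/)      ? 1 : 0;  # leading ^ negates the class
    $r{class_literal}  = ("^"    =~ /[a^]/)      ? 1 : 0;  # non-leading ^ is literal
    $r{as_asked}       = ("^x^"  =~ /^^[^^]^/)   ? 1 : 0;  # pattern from the question
    $r{as_intended}    = ("^x^"  =~ /^\^[^^]\^/) ? 1 : 0;  # same idea, carets escaped
    return %r;
}

my %caret = caret_demo();
printf "%-15s %d\n", $_, $caret{$_} for sort keys %caret;
```

Only escaped, newline_with_m, class_literal and as_intended should print
1. The unescaped carets outside the brackets are all anchors, so the
pattern from the question does not match "^x^".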

> And while I'm at it, how 'bout dollar?  I thought it was an anchor iff it
> was the last character in the regex, else it introduced a scalar variable
> for interpolation.

\$ means the character '$'
Regexps using ' as the delimiter (m'...') are non-interpolating, so $
there is never a variable; it is only ever the anchor.
$ interpolates if the name matches:
    * normal names like $foo
    * capture variables like $1 or $314159265
    * names with leading control characters like $^X or ${^WARNING_BITS}
    * punctuation like $$
        * EXCEPT the variables $(, $|, $)
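
The interpolation rules can be poked at with a few matches. A rough
sketch, not an exhaustive test: $foo and dollar_interp_demo are
hypothetical names invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: records 1 (matched) or 0 for each interpolation case.
sub dollar_interp_demo {
    my $foo = 'bar';    # hypothetical variable, for illustration only
    my %r;
    $r{interpolated} = ("bar"  =~ /^$foo$/)  ? 1 : 0;  # $foo becomes "bar"
    $r{escaped}      = ('$foo' =~ /^\$foo$/) ? 1 : 0;  # \$ is a literal dollar
    $r{single_quote} = ("bar"  =~ m'^$foo$') ? 1 : 0;  # m'...' does not interpolate
    $r{dollar_pipe}  = ("xb"   =~ /a$|b/)    ? 1 : 0;  # $| is anchor plus alternation
    $r{dollar_paren} = ("a"    =~ /(a$)/)    ? 1 : 0;  # $) is anchor plus close paren
    return %r;
}

my %interp = dollar_interp_demo();
printf "%-13s %d\n", $_, $interp{$_} for sort keys %interp;
```

Everything except single_quote should print 1. If $| had interpolated,
the dollar_pipe pattern would have contained its value instead of an
alternation and "xb" would not have matched.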
$ that hasn't interpolated matches:
    * at the end of the string, or just before a \n at the end of the
      string, if /m isn't on
    * before every embedded newline as well, if /m is on
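
The anchor behaviour of $ is easy to confirm the same way. Again just a
sketch, with a made-up helper name (dollar_anchor_demo) and made-up test
strings.

```perl
use strict;
use warnings;

# Hypothetical helper: records 1 (matched) or 0 for each anchor case.
sub dollar_anchor_demo {
    my %r;
    $r{end_of_string}   = ("foo"      =~ /foo$/)  ? 1 : 0;  # plain end of string
    $r{before_final_nl} = ("foo\n"    =~ /foo$/)  ? 1 : 0;  # before a trailing newline
    $r{middle_no_m}     = ("foo\nbar" =~ /foo$/)  ? 1 : 0;  # embedded newline, no /m
    $r{middle_with_m}   = ("foo\nbar" =~ /foo$/m) ? 1 : 0;  # embedded newline, /m on
    return %r;
}

my %anchor = dollar_anchor_demo();
printf "%-16s %d\n", $_, $anchor{$_} for sort keys %anchor;
```

Only middle_no_m should print 0.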

What counts as a "normal" name changes depending on whether your
source code is ASCII, Unicode, or EBCDIC.

Josh

