[Melbourne-pm] case insensitive REs

Craig Sanders cas at taz.net.au
Tue May 6 22:23:58 PDT 2008


On Wed, May 07, 2008 at 02:11:23PM +1000, Tim Connors wrote:
> I want the user to be able to supply a -i flag to my program to make 
> global case insensitive searching.
>
> [...]
>
> Scalar found where operator expected at /home/ssi/tconnors/bin/phrasegrep 
> line 131, near "/($re)/g$case"
>         (Missing operator before $case?)
> 
> I would have expected perhaps a global variable in perhaps perlvar(1) 
> telling me I could force a global case insensitive match.
> 
> The only way around this that I can see is the butt ugly:
> 
> if ($case) {
>   while (/($re)/gi) {
>      ...
>   }
> } else {
>   while (/($re)/g) {
>      ...
>   }
> }

NOTE: the following is "Untested but it should work because i've done
similar stuff before and the docs say so too<TM>". remember that
trademark, it's your non-guarantee of quality :)


$re = '(?i)' . $re if ($case);
while (/($re)/g) {
  ...
} 

alternatively:

$mods = '';
$mods .= 'i' if ($case);
# only i, m, s and x can be embedded; g has to stay on the match operator
$re = "(?$mods)$re" if ($mods ne '');

while (/($re)/g) {
  ...
} 
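
a self-contained sketch of the "(?i)" prefix approach, in case you want
something to paste and run. find_all() and the sample text are made up for
illustration; they are not from phrasegrep:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every match of $re in $text,
# case-insensitively when $case is true.
sub find_all {
    my ($re, $text, $case) = @_;
    $re = "(?i)$re" if $case;    # embed the modifier in the pattern itself
    my @hits;
    while ($text =~ /($re)/g) {  # /g stays on the operator
        push @hits, $1;
    }
    return @hits;
}

my $text = "Foo bar FOO baz foo";
print join(",", find_all("foo", $text, 1)), "\n";   # Foo,FOO,foo
print join(",", find_all("foo", $text, 0)), "\n";   # foo
```

the point is that only the pattern string changes; the loop itself is
written once.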


from perlre(1):


    "(?imsx-imsx)"
        One or more embedded pattern-match modifiers, to be turned on
        (or turned off, if preceded by "-") for the remainder of the
        pattern or the remainder of the enclosing pattern group (if
        any). This is particularly useful for dynamic patterns, such as
        those read in from a configuration file, read in as an argument,
        are specified in a table somewhere, etc.  Consider the case that
        some of which want to be case sensitive and some do not.  The
        case insensitive ones need to include merely "(?i)" at the front
        of the pattern.  For example:

            $pattern = "foobar";
            if ( /$pattern/i ) { }

            # more flexible:

            $pattern = "(?i)foobar";
            if ( /$pattern/ ) { }

        These modifiers are restored at the end of the enclosing
        group. For example,

            ( (?i) blah ) \s+ \1

        will match a repeated (including the case!) word "blah" in any
        case, assuming "x" modifier, and no "i" modifier outside this
        group.
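
the scoping rule in that last paragraph is easy to check. a small sketch
(the test strings are made up, and i've dropped the "x" modifier so the
pattern contains no literal spaces):

```perl
use strict;
use warnings;

# (?i) is scoped to the capture group; \1 sits outside it, so the
# backreference must repeat the captured text with the same case.
my $re = qr/((?i)blah)\s+\1/;

my @samples = ("BLAH BLAH", "Blah Blah", "blah BLAH");
for my $s (@samples) {
    printf "%-10s %s\n", $s, ($s =~ $re ? "matches" : "no match");
}
```

the first two strings should match and the third should not, because the
second word differs in case from the first.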






also remember: if $re is never going to change during the life of the
program, then you can add the "/o" modifier, which tells perl to
interpolate and compile the regexp only once. don't expect a big win,
though: perl already skips the recompile when the interpolated pattern
is the same string as last time, so the gain is usually small, and "/o"
will silently ignore any later change to $re.

(digression: i just noticed that the /o modifier isn't mentioned in my
perlre man page, but it is discussed in the perlretut man page, and
perlop(1) lists it with the other match-operator options. odd.
perl v5.8.8)


so:


$re = '(?i)' . $re if ($case);
while (/($re)/go) {
  ...
} 

or:

$mods = '';
$mods .= 'i' if ($case);
# g and o are operator flags and can't go inside (?...)
$re = "(?$mods)$re" if ($mods ne '');

while (/($re)/go) {
  ...
} 
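
a related way to compile the pattern exactly once is the qr// operator.
this is a different technique from the /o examples above, offered only as
a sketch; $case, $re and the sample lines are hypothetical stand-ins for
whatever the real program has:

```perl
use strict;
use warnings;

# Hypothetical inputs standing in for the program's -i flag and pattern.
my $case = 1;
my $re   = 'foo';

# Compile once, up front; the (?i) prefix is added only when asked for.
my $rx = $case ? qr/(?i)$re/ : qr/$re/;

my @lines = ("Foo and foo", "no match here", "FOO");
my @hits;
for my $line (@lines) {
    while ($line =~ /($rx)/g) {   # reuses the compiled regexp
        push @hits, $1;
    }
}
print "@hits\n";   # Foo foo FOO
```

unlike /o, a qr// object can be rebuilt later if the pattern ever does need
to change.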






see also perlretut(1). search for the section "Embedding comments and
modifiers in a regular expression".

and just above that is a section on compiling and saving regexps (i.e.
the qr// operator, a more flexible relative of the /o modifier).



craig

-- 
craig sanders <cas at taz.net.au>

BOFH excuse #451:

astropneumatic oscillations in the water-cooling

