[sf-perl] help with matching?

david wright david_v_wright at yahoo.com
Mon Jul 12 14:17:05 PDT 2010


>----- Original Message ----
>From: Fred Moyer <fred at redhotpenguin.com>
>To: extasia at extasia.org
>Cc: sfperl <sanfrancisco-pm at pm.org>
>Sent: Mon, July 12, 2010 11:55:38 AM
>Subject: Re: [sf-perl] help with matching?
>
>On Mon, Jul 12, 2010 at 11:30 AM, David Alban <extasia at extasia.org> wrote:
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 6 => 'srwd15abx001.srwd15.com'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 7 => 'srwd15abx001'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 8 => '.srwd15.com'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 9 => '.srwd15.com'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 10 => 'srwd15hst001'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 11 => '.srwd15.com'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 12 => ''
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 13 => ''
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 14 => ''
>> $VAR1 = qr/(?msx-i: \A \s* (?x-ism: ( ( \d{1,3} ) [.] ( \d{1,3} ) [.]
>> ( \d{1,3} ) [.] ( \d{1,3} ) ) ) \s+ (?x-ism: ( ( \w+ ) ( [.] \w+ [.]
>> \w+ )? ) ) \s+ (?x-ism: ( ( \w+ ) ( [.] \w+ [.] \w+ )? ) ) \s* \z )/;
>>
>> here is the line to parse (excluding the single quotes):
>>
>> '10.80.15.14     srwd15abx001.srwd15.com   srwd15hst001.srwd15.com  '
>>
>> here is my attempt to make the regex more human readable [edited above
>> line by hand--best viewed with non-proportional font:]
>>
>> qr/
>>  (?msx-i:
>>    \A
>>      \s*
>>        (?x-ism: (
>>                       ( \d{1,3} )
>>                   [.] ( \d{1,3} )
>>                   [.] ( \d{1,3} )
>>                   [.] ( \d{1,3} )
>>                 )
>>        )
>>        \s+
>>        (?x-ism: (
>>                   ( \w+ )
>>                   (
>>                     [.] \w+ [.] \w+
>>                   )?
>>                 )
>>        )
>>        \s+
>>        (?x-ism: (
>>                   ( \w+ )
>>                   (
>>                     [.] \w+ [.] \w+
>>                   )?
>>                 )
>>        )
>>      \s*
>>    \z
>>  )
>> /;
>>
>> what am i missing?
>
>IMHO this regex is too complicated.  Why are you attempting to pull
>out the dotted quads of the ip address individually instead of
>grabbing the ip address with one match?  Same for the host name - it
>is better (IMHO) to make a more readable piece of code that could be
>more computationally expensive.
>
>You might be able to simplify this using some other tools such as
>split().  10 lines of readable code with comments trumps a single
>regex that one has to put effort into understanding.



I agree with Fred: this is a perfect use case for split().

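#!/usr/bin/perl
use strict;
use warnings;

while (my $line = <DATA>) {
    # skip comment and blank lines (otherwise "# end" is parsed as data)
    next if $line =~ /^\s*(?:#|$)/;

    # split ' ' ignores leading whitespace and drops the trailing newline
    my ($ip, $vh1, $vh2) = split ' ', $line;
    print "$ip, $vh1, $vh2\n";
}

__DATA__
10.80.15.14    srwd15abx001.srwd15.com  srwd15hst001.srwd15.com
10.80.15.13    srwd1abx001.srw15.com  srwd1abx001.srw15.com
10.80.15.11    srwd1abx001.srw15.com  srwd1abx001.srw15.com
# end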

output:
[dwright~/perl(44)]$ perl pw_sp.pl
10.80.15.14, srwd15abx001.srwd15.com, srwd15hst001.srwd15.com
10.80.15.13, srwd1abx001.srw15.com, srwd1abx001.srw15.com
10.80.15.11, srwd1abx001.srw15.com, srwd1abx001.srw15.com
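
If you still want the sanity checking the original regex was doing, here is a rough sketch of how that could sit on top of split. parse_line is a made-up helper name, and the host name check (one or more \w+ labels joined by dots) is my assumption about what counts as valid; it is looser than the optional two-label domain in the original pattern. The IP is checked with one match instead of a capture per octet, as Fred suggested.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns (ip, host1, host2), or an empty list if the
# line does not look like "ip host host".
sub parse_line {
    my ($line) = @_;

    # split ' ' skips leading whitespace and discards the trailing newline.
    my @fields = split ' ', $line;
    return unless @fields == 3;

    my ($ip, $vh1, $vh2) = @fields;

    # One match for the whole dotted quad; no per-octet captures.
    return unless $ip =~ /\A \d{1,3} (?: [.] \d{1,3} ){3} \z/x;

    # Assumed host format: a bare name or dot-separated \w+ labels.
    for my $host ($vh1, $vh2) {
        return unless $host =~ /\A \w+ (?: [.] \w+ )* \z/x;
    }

    return ($ip, $vh1, $vh2);
}

my $line = '  10.80.15.14     srwd15abx001.srwd15.com   srwd15hst001.srwd15.com  ';
if (my ($ip, $vh1, $vh2) = parse_line($line)) {
    print "$ip, $vh1, $vh2\n";
}
else {
    warn "unparseable line: $line\n";
}
```

A line with the wrong number of fields, or with something that is not a dotted quad in the first column, gets reported instead of silently printing half-empty output. Each check is one short pattern, so it stays readable.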

