[sf-perl] help with matching?
david wright
david_v_wright at yahoo.com
Mon Jul 12 14:17:05 PDT 2010
>----- Original Message ----
>From: Fred Moyer <fred at redhotpenguin.com>
>To: extasia at extasia.org
>Cc: sfperl <sanfrancisco-pm at pm.org>
>Sent: Mon, July 12, 2010 11:55:38 AM
>Subject: Re: [sf-perl] help with matching?
>
>On Mon, Jul 12, 2010 at 11:30 AM, David Alban <extasia at extasia.org> wrote:
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 6 =>
>> 'srwd15abx001.srwd15.com'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 7 => 'srwd15abx001'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 8 => '.srwd15.com'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 9 => '.srwd15.com'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 10 => 'srwd15hst001'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 11 => '.srwd15.com'
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 12 => ''
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 13 => ''
>> 2010-07-12 17:46:21 +0000 srwd00reg001 junk.perl[356] 14 => ''
>> $VAR1 = qr/(?msx-i: \A \s* (?x-ism: ( ( \d{1,3} ) [.] ( \d{1,3} ) [.]
>> ( \d{1,3} ) [.] ( \d{1,3} ) ) ) \s+ (?x-ism: ( ( \w+ ) ( [.] \w+ [.]
>> \w+ )? ) ) \s+ (?x-ism: ( ( \w+ ) ( [.] \w+ [.] \w+ )? ) ) \s* \z )/;
>>
>> here is the line to parse (excluding the single quotes):
>>
>> '10.80.15.14 srwd15abx001.srwd15.com srwd15hst001.srwd15.com '
>>
>> here is my attempt to make the regex more human readable [edited above
>> line by hand--best viewed with non-proportional font:]
>>
>> qr/
>>     (?msx-i:
>>         \A
>>         \s*
>>         (?x-ism: (
>>                 ( \d{1,3} )
>>             [.] ( \d{1,3} )
>>             [.] ( \d{1,3} )
>>             [.] ( \d{1,3} )
>>             )
>>         )
>>         \s+
>>         (?x-ism: (
>>             ( \w+ )
>>             (
>>                 [.] \w+ [.] \w+
>>             )?
>>             )
>>         )
>>         \s+
>>         (?x-ism: (
>>             ( \w+ )
>>             (
>>                 [.] \w+ [.] \w+
>>             )?
>>             )
>>         )
>>         \s*
>>         \z
>>     )
>> /;
>>
>> what am i missing?
>
>IMHO this regex is too complicated. Why are you attempting to pull
>out the dotted quads of the ip address individually instead of
>grabbing the ip address with one match? Same for the host name - it
>is better (IMHO) to make a more readable piece of code that could be
>more computationally expensive.
>
>You might be able to simplify this using some other tools such as
>split(). 10 lines of readable code with comments trumps a single
>regex that one has to put effort into understanding.
I agree with Fred; this is a perfect use case for split.
#perl
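use strict;
use warnings;
# Split each record on whitespace; ' ' also skips any leading whitespace.
while (my $line = <DATA>) {
    my ($ip, $vh1, $vh2) = split ' ', $line;
    print "$ip, $vh1, $vh2\n";
}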
__DATA__
10.80.15.14 srwd15abx001.srwd15.com srwd15hst001.srwd15.com
10.80.15.13 srwd1abx001.srw15.com srwd1abx001.srw15.com
10.80.15.11 srwd1abx001.srw15.com srwd1abx001.srw15.com
# end
output:
[dwright~/perl(44)]$ perl pw_sp.pl
10.80.15.14, srwd15abx001.srwd15.com, srwd15hst001.srwd15.com
10.80.15.13, srwd1abx001.srw15.com, srwd1abx001.srw15.com
10.80.15.11, srwd1abx001.srw15.com, srwd1abx001.srw15.com
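If the short host name and the domain are still needed separately, as the original regex tried to capture them, a second split on the first dot would probably do. This is only a rough sketch: parse_line() is a made-up helper name, the IP check is a loose shape test that does not reject octets above 255, and the domain comes back without its leading dot (unlike the regex captures above). A host with no dot at all would leave the domain undefined.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_line() is a hypothetical helper, not something from the thread.
# Returns the IP plus one [short_name, domain] pair per host,
# or an empty list if the line does not look like "ip host host".
sub parse_line {
    my ($line) = @_;

    # split ' ' ignores leading and trailing whitespace.
    my ($ip, @hosts) = split ' ', $line;
    return unless defined $ip && @hosts == 2;

    # Loose shape check only: four dot-separated groups of 1-3 digits.
    return unless $ip =~ /\A \d{1,3} (?: [.] \d{1,3} ){3} \z/x;

    # A limit of 2 splits on the first dot only; the rest is the domain.
    my @parsed = map { [ split /[.]/, $_, 2 ] } @hosts;
    return ($ip, @parsed);
}

my ($ip, $h1, $h2) =
    parse_line(' 10.80.15.14 srwd15abx001.srwd15.com srwd15hst001.srwd15.com ');
print "$ip: $h1->[0] ($h1->[1]), $h2->[0] ($h2->[1])\n";
```

For the sample line this should print:

10.80.15.14: srwd15abx001 (srwd15.com), srwd15hst001 (srwd15.com)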