SPUG: RE question
Jacinta Richardson
jarich at perltraining.com.au
Wed Nov 16 17:55:44 PST 2005
Duane Blanchard wrote:
> $RE_year = "(19|20)\d\d";
> $RE_month = "(jan(uary)?|feb(ruary)?|mar(ch)?|apr(il)?|may|jun(e)?|jul(y)?|aug(ust)?|sep(tember)?|oct(ober)?|nov(ember)?|dec(ember)?)";
> $RE_day = "[0-3]?\d";
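Separating things out like this is great. However, have you considered that
the parens in the above statements will affect $1, $2 etc? You probably mean
to use non-capturing parens. I've also switched from strings to qr//, because
inside a double-quoted string "\d" loses its backslash and becomes a plain "d":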
my $RE_year = qr/(?:19|20)\d\d/;
my $RE_month =
qr/(?:jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|jun(?:e)?|jul(?:y)?|aug(?:ust)?|sep(?:tember)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)/i;
my $RE_day = qr/[0-3]?\d/;
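To see the effect on $1, here is a minimal sketch that builds the year fragment both ways and then adds one group we actually want to capture. The names $year_capturing, $year_non_capturing and first_capture are made up for this illustration.

```perl
use strict;
use warnings;

# The year fragment with capturing parens (as in the quoted code, but
# built with qr//) and with non-capturing parens.
my $year_capturing     = qr/(19|20)\d\d/;
my $year_non_capturing = qr/(?:19|20)\d\d/;

# Hypothetical helper: return whatever lands in $1, or undef on failure.
sub first_capture {
    my ($re, $text) = @_;
    return $text =~ $re ? $1 : undef;
}

my $text = "1993 Mar 3";

# The trailing (\w+) is the only group we meant to capture.
my $with_capture    = first_capture(qr/$year_capturing\s+(\w+)/,     $text);
my $without_capture = first_capture(qr/$year_non_capturing\s+(\w+)/, $text);

print "capturing year fragment:     \$1 is '$with_capture'\n";     # 19
print "non-capturing year fragment: \$1 is '$without_capture'\n";  # Mar
```

With the capturing fragment, the group you care about has silently moved to $2.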
> @array = ("1993 Mar 3", "1993 Mar 15", "Mar 15, 1993", "15 Mar 2001", "2001, 15 Mar");
>
> foreach $thing (@array)
> {
> # in the first disjunction, find any one of the defined REs, in the
> # second, find any but the first one you found, etc.
> if ($line =~ /($RE_year|$RE_month|$RE_day),?\s*([^$1]($RE_year|$RE_month|$RE_day)),?\s*([^$1$2]($RE_year|$RE_month|$RE_day)))
> {print "You got a date: too bad it isn't with a girl.";}
As you've determined, this isn't going to do what you want it to. In the case
of "2001, 15 Mar" your pattern says:
((19|20)\d\d),?\s* # so far, so good, matches "2001, "
([^201]([0-3]?\d)) # oops, need something which isn't a 1.
# backtrack, match the space in this char class
# and then 1 with that second char class. The
# \d then matches the 5.
,?\s* # matches okay: "2001, 15 "
([^2019]((mar(ch)?))) # hmm, okay, match the 'm' in the first char
# class, fail to find 'ar' in the options.
# backtrack and give the space to the char
# class, match "mar"
# pattern should match.
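The underlying problem is that a negated character class excludes single characters, not a whole string. The rough sketch below hard-codes the year's digits the way the walkthrough does, because Perl interpolates $1 when it builds the pattern, not while it is matching. The variable names are only for illustration.

```perl
use strict;
use warnings;

# [^2001] is a set of characters, so it is the same class as [^012].
my @not_excluded = grep { /[^2001]/ } ('0', '1', '2', '5', ' ', 'M');
print "single chars accepted: ", join('|', @not_excluded), "\n";  # 5| |M

# Second step of the walkthrough with the year's digits hard-coded.
my $captured;
if ("2001, 15 Mar" =~ /2001,?\s*([^2001][0-3]?\d)/) {
    $captured = $1;
    print "second group captured '$captured'\n";  # ' 15', with the space
}
```

The class cannot take the "1" of "15", so the engine backtracks and hands it the space instead, and the match goes through anyway.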
I expect you'll find it easier to handle each configuration separately. It also
makes it a little easier to read your code.
my $RE_YMD = qr{$RE_year [,/-]? \s* $RE_month [/-]? \s* $RE_day}x;
my $RE_MDY = qr{$RE_month \s+ $RE_day ,? \s+ $RE_year}x;
my $RE_DMY = qr{$RE_day \s+ $RE_month \s+ $RE_year}x;
my $RE_YDM = qr{$RE_year ,? \s* $RE_day \s+ $RE_month}x;
Putting this all together should make all of the following examples work correctly.
my @array = ("1993 Mar 3", "1993 Mar 15", "Mar 15, 1993", "15 Mar 2001",
"2001, 15 Mar", "1993-Jan-31", "Jan 1 2000", "26 Jan 1988", "1976 01 Aug");
foreach my $date (@array)
{
if($date =~ m/($RE_YMD|$RE_MDY|$RE_DMY|$RE_YDM)/ix) {
print "$1 Matched!\n";
}
else {
print "$date failed\n";
}
}
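If you also need to know which layout a string used, one option is to try the per-layout patterns one at a time instead of joining them with |. This is only a sketch: @layouts and date_layout are names I've made up, and anchoring with \A and \z is my assumption that each string holds nothing but a date.

```perl
use strict;
use warnings;

my $RE_year  = qr/(?:19|20)\d\d/;
my $RE_month = qr/(?:jan(?:uary)?|feb(?:ruary)?|mar(?:ch)?|apr(?:il)?|may|jun(?:e)?|jul(?:y)?|aug(?:ust)?|sep(?:tember)?|oct(?:ober)?|nov(?:ember)?|dec(?:ember)?)/i;
my $RE_day   = qr/[0-3]?\d/;

# Layout name followed by its pattern, in the order they are tried.
my @layouts = (
    [ YMD => qr{$RE_year [,/-]? \s* $RE_month [/-]? \s* $RE_day}x ],
    [ MDY => qr{$RE_month \s+ $RE_day ,? \s+ $RE_year}x ],
    [ DMY => qr{$RE_day \s+ $RE_month \s+ $RE_year}x ],
    [ YDM => qr{$RE_year ,? \s* $RE_day \s+ $RE_month}x ],
);

# Hypothetical helper: name of the first layout matching the whole
# string, or undef if none does.
sub date_layout {
    my ($text) = @_;
    for my $layout (@layouts) {
        my ($name, $re) = @$layout;
        return $name if $text =~ /\A$re\z/;
    }
    return undef;
}

for my $date ("1993 Mar 3", "Mar 15, 1993", "15 Mar 2001", "2001, 15 Mar", "next Tuesday") {
    my $layout = date_layout($date);
    print "$date: ", defined $layout ? $layout : "no match", "\n";
}
```

Note that $RE_day is still loose (it accepts "00" and "39"), so this checks the shape of a date rather than whether it is a real one.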
Hope this helps.
Jacinta
--
("`-''-/").___..--''"`-._ | Jacinta Richardson |
`6_ 6 ) `-. ( ).`-.__.`) | Perl Training Australia |
(_Y_.)' ._ ) `._ `. ``-..-' | +61 3 9354 6001 |
_..`--'_..-_/ /--'_.' ,' | contact at perltraining.com.au |
(il),-'' (li),' ((!.-' | www.perltraining.com.au |