[Chicago-talk] Help with Regex

James E Keenan jkeenan at pobox.com
Sat Dec 24 08:50:14 PST 2022


On 12/24/22 10:59, Richard Reina wrote:
> Happy Holidays Perl Family,
> 
> I am trying to use the regex below to find City, ST. Zip in a file. While it does work for instances like Chicago, IL 60614 or Dallas, TX 75234,
> it does not work for multi-word cities like Salt Lake City, UT 89159 or for nine-digit zip codes like Tampa, FL 33592-2787.
> Any help in getting my regex to work would be greatly appreciated.
>   
> 
> 
>   if ($row =~ /^([^,]+),\s([A-Z]{2})(?:\s(\d{5}-?\d{4}?))?$/) {
>        
>        print "I think I found a city state zip:\n";
>        print "$row\n";
>        chomp (my $ff=<STDIN>);
>              
>        }
> 

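I found that, as written, your regex does precisely the opposite of what 
you claimed it did: it matches only the nine-digit zip codes and rejects 
every five-digit one, whether the city is one word or several.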

#####
sub p {
     my $address = shift;
     # The regex exactly as posted.
     my $regex = qr/^([^,]+),\s([A-Z]{2})(?:\s(\d{5}-?\d{4}?))?$/;
     if ($address =~ m/$regex/) {
         my ($city, $state, $zip) = ($1,$2,$3);
         print "$city, $state $zip\n";
     }
     else {
         print "No match\n";
     }
}

p("Chicago, IL 60614");
p("Chicago, IL 60614-0000");
p("Chicago, IL 606140001");
p("Dallas, TX 75234");
p("Salt Lake City, UT 89159");
p("Tampa, FL 33592-2787");

No match
Chicago, IL 60614-0000
Chicago, IL 606140001
No match
No match
Tampa, FL 33592-2787
#####

In this portion of your pattern ...

#####
(?:\s(\d{5}-?\d{4}?))?
#####

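... the '?:' at the beginning means "cluster, but don't capture".  (See 
'perldoc perlre'.)  That part is harmless.  The trouble is the '?' after 
'\d{4}': placed directly after a '{4}' quantifier, it makes the quantifier 
non-greedy rather than making the four digits optional, so the pattern 
always demands nine digits.  To make the last four digits optional, 
cluster them together with the hyphen and put the '?' after the cluster.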
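
A quick way to check what the zip portion accepts is to run it on its own 
against bare zip strings.  The sketch below compares the sub-pattern as 
posted with a variant in which the '?' follows a cluster.  The variable 
names ($lazy, $optional) are mine, not from the original post, and the 
anchors are added only so each sub-pattern can be tested in isolation.

```perl
use strict;
use warnings;

# As posted: the '?' after {4} makes the quantifier non-greedy,
# so four more digits are still required.
my $lazy = qr/^\d{5}-?\d{4}?$/;

# Variant: the '?' applies to the whole cluster, so the hyphen
# and the last four digits may be absent.
my $optional = qr/^\d{5}(?:-?\d{4})?$/;

for my $zip ('60614', '60614-0000', '606140001') {
    printf "%-11s as posted: %-9s clustered: %s\n",
        $zip,
        ($zip =~ $lazy     ? 'match' : 'no match'),
        ($zip =~ $optional ? 'match' : 'no match');
}
```

Only the bare five-digit string should differ between the two columns: the 
posted sub-pattern rejects it, the clustered one accepts it.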

The following worked for me.

#####
sub r {
     my $address = shift;
     # /x allows whitespace in the pattern; the zip is now required,
     # and only its hyphen-plus-four-digits part is optional.
     my $regex = qr/^
         ([^,]+)
         ,\s
         ([A-Z]{2})
         \s
         (\d{5}(?:-?\d{4})?)
     $/x;
     if ($address =~ m/$regex/) {
         my ($city, $state, $zip) = ($1,$2,$3);
         print "$city, $state $zip\n";
     }
     else {
         print "No match\n";
     }
}

r("Chicago, IL 60614");
r("Chicago, IL 60614-0000");
r("Chicago, IL 606140001");
r("Dallas, TX 75234");
r("Salt Lake City, UT 89159");
r("Tampa, FL 33592-2787");

Chicago, IL 60614
Chicago, IL 60614-0000
Chicago, IL 606140001
Dallas, TX 75234
Salt Lake City, UT 89159
Tampa, FL 33592-2787
#####
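
If you want the pieces back rather than just a printed line, the same 
pattern as in r() can be written with named captures (Perl 5.10 or later).  
This is only a sketch: parse_address() and the hash keys are names I made 
up for illustration, and I am assuming each line holds nothing but the 
city, state and zip.

```perl
use strict;
use warnings;

# Same pattern as in r(), with named captures instead of $1, $2, $3.
my $city_state_zip = qr/^
    (?<city>  [^,]+ )
    ,\s
    (?<state> [A-Z]{2} )
    \s
    (?<zip>   \d{5} (?:-?\d{4})? )
$/x;

# parse_address() is a hypothetical helper: it returns a hash
# reference on a match and undef otherwise.
sub parse_address {
    my ($line) = @_;
    return undef unless $line =~ $city_state_zip;
    return { city => $+{city}, state => $+{state}, zip => $+{zip} };
}

# '$' also matches just before a trailing newline, so lines read
# from a file do not need to be chomped first.
for my $line ("Salt Lake City, UT 89159\n", "not an address\n") {
    my $parsed = parse_address($line);
    print $parsed
        ? "$parsed->{city} | $parsed->{state} | $parsed->{zip}\n"
        : "No match\n";
}
```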


