[Pdx-pm] regex & phone numbers...
Joshua Keroes
jkeroes at eli.net
Thu Jul 25 14:02:03 CDT 2002
On (Thu, Jul 25 10:23), Kari Chisholm wrote:
> I want to clean up phone numbers that people submit in a form. They
> could come in all kinds of ways, like...
>
> (503) 123-4567
> 503.123.4567
> (503)-123-4567
> 123-4567
> 503 123 4567
> 503-123-4567 ext. 89
> 123-4567 ext. 89
> 503-123-4567-mom's house
>
> I want to convert the actual seven- or ten-digit phone number part
> to just xxx-xxx-xxxx. [snip] I've conceptualized any number of
> highly complex and idiotic ways of doing this. I'm just wondering
> if there's a simpler regex approach to this... Any ideas?
Just make sure you trim the PDX-pm email footer off the DATA section.
-Joshua
#!/usr/local/bin/perl -w
use strict;
our $AREACODE = 503; # Output default
our $DELIM = '-'; # Output default
my $areacode_re = qr/\(? ( \d{3} )? \)?/x;
my $delim_re = qr/[-. ]/;
my $lastseven_re = qr/( \d{3} ) $delim_re ( \d{4} )/x;
my $ext_delim_re = qr/(?: ext\.? | x )/xi; # "ext", "ext." or "x", any case
my $ext_re = qr/( \d+ )/x;
my $phone_re = qr/
    $areacode_re     \s*
    $delim_re?       \s*
    $lastseven_re    \s*
    $ext_delim_re?   \s*
    $ext_re?
/x;
while (<DATA>) {
    chomp;
    my $nice = format_phone($_) || '?';
    printf "%25s => %s\n", $_, $nice;
}
exit;
# subs
sub format_phone {
    my $ugly = shift or die "Didn't get a phone number. Aborting";
    my ($areacode, $mid3, $last4, $ext) = $ugly =~ $phone_re;
    unless ($mid3 && $last4) {
        warn "Unable to parse phone number: '$ugly'";
        return;
    }
    $areacode ||= $AREACODE;
    my $nice = join $DELIM, ($areacode, $mid3, $last4);
    $nice .= " x$ext" if $ext;
    return $nice;
}
__DATA__
(503) 123-4567
503.123.4567
(503)-123-4567
123-4567
503 123 4567
503-123-4567 ext. 89
123-4567 ext. 89
503-123-4567-mom's house
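
For comparison, here is a rough sketch of the same idea folded into a single anchored /x pattern. It is only one possible variant, not the script above: the name normalize_phone and the optional default-area-code argument are made up for illustration. It assumes the number starts the string (leading whitespace aside), treats "ext", "ext." and "x" as extension markers in any case, and ignores trailing text such as "-mom's house".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns AAA-PPP-NNNN,
# plus " xEXT" when an extension is found, or undef when nothing parses.
sub normalize_phone {
    my ($raw, $default_areacode) = @_;
    $default_areacode = '503' unless defined $default_areacode;  # assumed default
    return unless defined $raw;

    my ($area, $mid3, $last4, $ext) = $raw =~ m{
        ^ \s*
        (?: \(? (\d{3}) \)? \s* [-. ]? \s* )?   # optional area code, parens optional
        (\d{3}) [-. ] (\d{4})                   # seven-digit local number
        (?: \s* (?: ext\.? | x ) \s* (\d+) )?   # optional extension
    }xi;

    return unless defined $mid3;                # no seven-digit number found
    $area = $default_areacode unless defined $area;

    my $nice = join '-', $area, $mid3, $last4;
    $nice .= " x$ext" if defined $ext;
    return $nice;
}

# Example calls on two of the sample inputs.
print normalize_phone($_), "\n" for '(503) 123-4567', '123-4567 ext. 89';
```

Anchoring at the start keeps the pattern from matching in the middle of a longer digit run, and testing captures with defined (rather than for truth) avoids dropping an extension of "0".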