[Pdx-pm] regex & phone numbers...

Curtis Poe cp at onsitetech.com
Thu Jul 25 13:56:54 CDT 2002


----- Original Message -----
From: "Kari Chisholm" <karic at lclark.edu>
To: <pdx-pm-list at pm.org>
Sent: Thursday, July 25, 2002 10:23 AM
Subject: [Pdx-pm] regex & phone numbers...


>
> Hey all...
>
> Now that all that madness is past...  I've got another regex stumper.
>
> I want to clean up phone numbers that people submit in a form.  They
> could come in all kinds of ways, like...
>
> (503) 123-4567
> 503.123.4567
> (503)-123-4567
> 123-4567
> 503 123 4567
> 503-123-4567 ext. 89
> 123-4567 ext. 89
> 503-123-4567-mom's house
>
> I want to convert the actual seven- or ten-digit phone number part to
> just xxx-xxx-xxxx.  I also want to leave alone anything that comes
> after that - which is obviously the tough part.  The logic should be
> basically this: just process through the number left to right,
> grabbing the first seven or ten numbers, then reformat those and tack
> on whatever's left.  The challenge is figuring out when it's a
> seven-digit or a ten-digit number.
>
> I've conceptualized any number of highly complex and idiotic ways of
> doing this.  I'm just wondering if there's a simpler regex approach to
> this...  Any ideas?

Kari,

My suggestion, if you're allowed to do this:  rethink the problem.  Rather
than try to clean up their data, only allow them to enter data in a format
that *you* specify.  One way to do this is to use the following text on the
form, near the input box.

  Phone (xxx-xxx-xxxx format only):

Then, in your code:

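  use CGI qw(param);    # provides param()

  my $tainted_phone = param('phone');
  # \A and \z anchor the match, so the whole field must be xxx-xxx-xxxx;
  # without them, extra text around a valid number would slip through.
  my ($phone) = $tainted_phone =~ /\A(\d\d\d-\d\d\d-\d\d\d\d)\z/;
  if ( ! $phone ) {
      # some error processing and send 'em back to the form
  }
  else {
      # do the right thing
  }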

Of course, you'll have to customize that to your needs (such as allowing an
optional extension), but I find it easier to ensure that the user does
things right, rather than try to guess what the user did.
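
As a rough sketch of the "optional extension" customization, assuming you
tell users to type the extension as "x 89" after the number (that format,
the five-digit limit, and the parse_phone name are my own assumptions, not
anything standard):

```perl
use strict;
use warnings;

# Hypothetical helper: returns (phone, extension) when the input is
# xxx-xxx-xxxx, optionally followed by " x " and 1 to 5 digits.
# Returns an empty list when the input does not fit that format.
sub parse_phone {
    my ($tainted) = @_;
    return unless defined $tainted;

    my ($phone, $ext) =
        $tainted =~ /\A(\d{3}-\d{3}-\d{4})(?:\s+x\s*(\d{1,5}))?\z/;

    return unless defined $phone;
    return ($phone, $ext);    # $ext is undef when no extension was given
}
```

You would call it as my ($phone, $ext) = parse_phone(param('phone')) and
send the user back to the form if $phone comes back undefined.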

Another trick is to provide separate input boxes for the area
code, exchange, number, extension, and perhaps a note.  Imagining that the
underscores are input boxes:

  ___-___-____ x _____  Note: _________

That's even easier to parse, but requires a bit more gruntwork.  You'll have
to let the users know what is mandatory and what is optional, though.
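
A minimal sketch of checking the separate boxes, assuming the fields are
named area, exchange, number and ext, and that only the extension is
optional (the field names, the build_phone helper and the mandatory area
code are all assumptions for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: takes the submitted fields as name => value pairs.
# Returns (formatted_phone, []) on success, or (undef, \@bad_fields)
# listing the boxes the user needs to fix.
sub build_phone {
    my (%f) = @_;

    # Mandatory boxes and the exact digit count each must contain.
    my %rules = (
        area     => qr/\A\d{3}\z/,
        exchange => qr/\A\d{3}\z/,
        number   => qr/\A\d{4}\z/,
    );

    my @errors;
    for my $name (qw(area exchange number)) {
        my $value = defined $f{$name} ? $f{$name} : '';
        push @errors, $name unless $value =~ $rules{$name};
    }

    # The extension is optional: empty, or up to five digits.
    my $ext = defined $f{ext} ? $f{ext} : '';
    push @errors, 'ext' unless $ext =~ /\A\d{0,5}\z/;

    return (undef, \@errors) if @errors;

    my $phone = join '-', @f{qw(area exchange number)};
    $phone .= " x $ext" if length $ext;
    return ($phone, []);
}
```

The list of bad field names makes it easy to highlight exactly which box
to fix when you send the user back to the form.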

--
Cheers,
Curtis Poe
Senior Programmer
ONSITE! Technology, Inc.
www.onsitetech.com
503-233-1418

Taking e-Business and Internet Technology To The Extreme!



