[Chicago-talk] help with Reg Exp, pls...

Walter Torres walter at torres.ws
Mon Dec 15 22:28:07 CST 2003


> -----Original Message-----
> From: chicago-talk-bounces at mail.pm.org
> [mailto:chicago-talk-bounces at mail.pm.org]On Behalf Of Steven Lembark
> Sent: Monday, December 15, 2003 5:15 PM
> To: Chicago.pm chatter
> Subject: Re: [Chicago-talk] help with Reg Exp, pls...


> Not really sure what you are asking for. "a test" could mean
> nearly anything...

I'm sorry, kind people; I really didn't mean to be so vague about this...

I have a single string of unknown length and composition, and I need to make
sure it contains only characters that are used in personal names by 99.99999%
of the Roman-alphabet-using population of this planet.

Some examples (by no means a complete list):

   Dr. Roger S. O'Malley Jr., PHD
   Mrs. Sara Harris-Henderson
   Manuel Gonzalez              <-- should contain extended chars
   Forsok Bokstaver             <-- should contain extended chars
   Contem Espaco-Valido         <-- should contain extended chars

These are the rules I came up with to try and define this...

assuming:
 - a SPACE is used to delimit one part of a name from another [Walter Torres]
 - a HYPHEN is used to separate a dual name (British style) [Conrad-Smyth]
 - an APOSTROPHE is used in many Irish and Scottish names [O'Reilly]
 - a PERIOD is used in titles, suffixes [Dr. Sr. Jr.]
 - a PERIOD is used for Initials [Walter G. Torres]
 - a COMMA is used to delimit a citation of some sort
   [Dr. Samuel Ellis-Honing, PHD]

then...
 - allow APOSTROPHE, but only if preceded *and* followed by an Alpha/Extend
 - allow HYPHEN/dash, but only if preceded *and* followed by an Alpha/Extend
 - allow PERIOD, but only if preceded by an Alpha/Extend *and* followed by a
   SPACE or EOS
 - allow SPACE, but only if preceded by an Alpha/Extend, PERIOD or COMMA
   *and* followed by an Alpha/Extend
 - allow COMMA, but only if preceded by an Alpha/Extend *and* followed by a
   SPACE
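
The rules above can be sketched as one pattern built from lookarounds, where
each alternative consumes a single character and checks its neighbours. This
is only a minimal Python sketch under stated assumptions, not a definitive
answer: the names ALPHA, NAME_RE and is_valid_name are made up for
illustration, and "Alpha/Extend" is assumed to mean ASCII letters plus
\x80-\xFF (the range used in the attempt in this message). The same
lookaround constructs should carry over to Perl.

```python
import re

# Assumption: "Alpha/Extend" = ASCII letters plus the range \x80-\xFF.
ALPHA = r"A-Za-z\x80-\xFF"

# Each alternative consumes exactly one character; the lookbehind and
# lookahead enforce what may come immediately before and after it.
NAME_RE = re.compile(
    rf"""
    (?:
        [{ALPHA}]                               # any Alpha/Extend character
      | (?<=[{ALPHA}]) ['-] (?=[{ALPHA}])       # apostrophe or hyphen between two letters
      | (?<=[{ALPHA}]) \.   (?=[ ]|\Z)          # period after a letter, before a space or the end
      | (?<=[{ALPHA}.,]) [ ] (?=[{ALPHA}])      # space after a letter, period or comma, before a letter
      | (?<=[{ALPHA}]) ,    (?=[ ])             # comma after a letter, before a space
    )+
    """,
    re.VERBOSE,
)


def is_valid_name(text: str) -> bool:
    """Return True if the whole string satisfies the rules as written."""
    return NAME_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in (
        "Mrs. Sara Harris-Henderson",
        "Dr. Samuel Ellis-Honing, PHD",
        "Walter  Torres",   # double space
        "Walter Torres ",   # trailing space
    ):
        print(repr(sample), is_valid_name(sample))
```

One thing this sketch makes visible: under the rules exactly as written,
"Dr. Roger S. O'Malley Jr., PHD" is rejected, because in "Jr.," the PERIOD is
followed by a COMMA and the COMMA is preceded by a PERIOD. Accepting it would
mean relaxing those two rules.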

Based upon these rules, I've come up with...

     /^([a-z\x80-\xFF]+(. )?[ ]?)+$/i

It works, sort of..

 - it does not handle the COMMA, HYPHEN or APOSTROPHE rules.
 - it allows (multiple) spaces at the end.
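
Part of the trailing-space behaviour seems to come from the unescaped "." in
"(. )?", which matches any character, not only a literal PERIOD. The short
Python check below is a sketch to illustrate that; the names ORIGINAL and
ESCAPED are made up here, and ESCAPED differs from the pattern above only in
escaping the dot.

```python
import re

# The pattern from this message, unchanged.
ORIGINAL = re.compile(r"^([a-z\x80-\xFF]+(. )?[ ]?)+$", re.IGNORECASE)

# Same pattern with only the dot escaped (an illustrative variant).
ESCAPED = re.compile(r"^([a-z\x80-\xFF]+(\. )?[ ]?)+$", re.IGNORECASE)

samples = [
    "Walter Torres  ",  # two trailing spaces
    "Walter3 Torres",   # a digit followed by a space
    "Walter Torres ",   # one trailing space
    "O'Malley",         # apostrophe
]

for text in samples:
    print(
        repr(text),
        "original:", bool(ORIGINAL.match(text)),
        "escaped:", bool(ESCAPED.match(text)),
    )
```

With the dot unescaped, "(. )?" can swallow a digit plus a space, or two
spaces in a row. Escaping it closes that hole, but the trailing "[ ]?" still
lets a single space through at the end, and the COMMA, HYPHEN and APOSTROPHE
rules remain unhandled either way.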

I'm hoping someone smarter than me can see how to make this work as desired.

Steve, I hope this explains what I'm after a bit more clearly.

Thanks for your time.

Walter




