[Chicago-talk] help with Reg Exp, pls...

Walter Torres walter at torres.ws
Tue Dec 16 13:18:28 CST 2003

> -----Original Message-----
> From: chicago-talk-bounces at mail.pm.org
> [mailto:chicago-talk-bounces at mail.pm.org]On Behalf Of Steven Lembark
> Sent: Tuesday, December 16, 2003 12:44 PM
> To: Chicago.pm chatter
> Subject: RE: [Chicago-talk] help with Reg Exp, pls...

> Does your 99.99% include Japanese, Chinese, or Russian
> individuals?

No sir. They do not use Roman-based characters.

> If so then you probably have to deal with
> UTF in some form, at which point all bets on what is
> "reasonable" are off. If you restrict the names to
> ASCII it works, but then you reject all of the UTF8 accents
> shown in the example.
> The simplest fix -- given that you may need to regex UTF
> describing a name in any obscure language -- may be to just
> use a check for [:print:] and be done with it: anything
> that is not a printing character is probably not something
> you want to deal with. It would also probably make sense
> to use "join ' ', split" to strip extra whitespace and
> convert all of it to single spaces.

Yes, that could work.
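
A rough sketch of that approach, untested against real data. It assumes
Perl 5.14 or later (for the /u regex flag) and that the input has already
been decoded into a character string. The name clean_name is made up for
illustration; the printing-character check uses the POSIX class in its
bracketed form, [[:print:]].

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the thread).
# Returns the name with whitespace normalized, or undef if the input is
# empty or contains anything that is not a printing character.
sub clean_name {
    my ($raw) = @_;
    return unless defined $raw;

    # split ' ' skips leading whitespace and splits on runs of whitespace;
    # join ' ' then puts exactly one space between the pieces.
    my $name = join ' ', split ' ', $raw;

    return unless length $name;

    # /u asks for Unicode rules, so accented Latin-1 characters such as
    # e-acute count as printing characters even in a non-UTF8-flagged string.
    return unless $name =~ /\A[[:print:]]+\z/u;

    return $name;
}

# Usage: extra whitespace collapses to single spaces.
my $cleaned = clean_name("  Walter \t Torres\n");
print "$cleaned\n" if defined $cleaned;
```

Whitespace is normalized before the check, so tabs and newlines between
words are turned into spaces instead of causing a rejection. Other control
characters still fail the [[:print:]] test.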


