[Fwd: SPUG:regex question]

ced at carios2.ca.boeing.com ced at carios2.ca.boeing.com
Tue Jan 28 14:23:00 CST 2003


>... I'm looking for a way to catch and remove,
> or possibly change, the text content of a WORD doc when it is
>cut-n-paste from WORD into a web textarea box.

> I was hoping there was a regex such as: $text =~
> /non_printable_chars//g;

> I like to remove everything that is not a standard printable letter,

You might use J. Friedl's character class to catch "viewable" ASCII:
[!-~]   or  [\x21-\x7e]  (which is longer but shows more clearly
         that you're after a character encoding range).

Some European characters in ISO-8859-1 would be clobbered
though: umlauted u, for example.

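But, if this isn't an issue, one way:   $text =~ s/[^!-~]//g;

(This also strips spaces, tabs, and newlines, since they fall outside
the [!-~] range; s/[^ -~\s]//g keeps them.)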
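
A minimal sketch of how the substitution behaves on pasted text. The helper names strip_non_viewable and strip_keep_whitespace are made up for illustration, and the sample string assumes Word's "smart" quotes arrive as Windows-1252 bytes (0x92-0x94), which depends on how the browser submits the form:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): keep only "viewable"
# ASCII, i.e. the range '!' (0x21) through '~' (0x7e).
sub strip_non_viewable {
    my ($text) = @_;
    $text =~ s/[^!-~]//g;    # also removes spaces, tabs, and newlines
    return $text;
}

# Hypothetical variant that additionally keeps space, tab, and newline.
sub strip_keep_whitespace {
    my ($text) = @_;
    $text =~ s/[^\x20-\x7e\t\n]//g;
    return $text;
}

# Assumed sample input: Windows-1252 smart quotes around ordinary text.
my $pasted = "It\x{92}s a \x{93}test\x{94}\n";

print strip_non_viewable($pasted), "\n";    # prints: Itsatest
print strip_keep_whitespace($pasted);       # prints: Its a test
```

Either way, any ISO-8859-1 letter above 0x7e (the umlauted u, 0xfc, for instance) is dropped as well, so this only suits text that is expected to be plain ASCII.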

--
Charles DeRykus


