[tpm] Manipulating utf8 strings with Perl

Antonio Sun antoniosun at lavabit.com
Tue May 8 13:27:25 PDT 2012

Thanks everyone for your replies.

Yeah, I realized afterward that the file is UTF-16.

On Tue, May 8, 2012 at 3:27 PM, Vinny Alves <vinny at usestrict.net> wrote:

> perl -e 'binmode(STDIN,":encoding(UTF-8)"); while(<>){ s/<s:Body>/<s:Body xmlns:a=...>/ }' < myfile.utf8

Thanks, that's exactly what I was looking for.
However, I tried the following and it doesn't work.

$ perl -Mutf8 -pe 'binmode(STDIN,":encoding(UTF-16)");
 s/<s:Envelope/<s:Envelope xmlns:a=...>/' < myfile.utf8 > myfile.out

$ cmp myfile.utf8 myfile.out && echo same

cmp reports the two files as identical, so the substitution never happened.
It seems that the replacement strings ('s:Envelope...') are not handled as
wide chars in Perl.
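
A likely explanation, offered as a sketch rather than a confirmed diagnosis: -Mutf8 only declares that the program source is UTF-8 and does not change how input is read, and under -p the binmode call runs inside the loop body, after the first line has already been read as raw bytes. In raw UTF-16 every ASCII character is paired with a NUL byte, so /<s:Envelope/ cannot match. The script below uses in-memory filehandles and a made-up sample document; the namespace value "urn:example" is a hypothetical placeholder, not the value elided in the original command.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sample standing in for myfile.utf8 (assumed to be UTF-16 with a BOM).
my $xml   = qq{<s:Envelope><s:Body>hi</s:Body></s:Envelope>\n};
my $bytes = encode('UTF-16', $xml);

# 1. Matching against the raw bytes fails, because each ASCII character
#    is interleaved with a NUL byte in UTF-16.
my $raw_matches = ($bytes =~ /<s:Envelope/) ? 1 : 0;

# 2. Decode on input, substitute on characters, encode again on output.
open my $in, '<:encoding(UTF-16)', \$bytes or die "open in: $!";
my $out_bytes = '';
open my $out, '>:encoding(UTF-16)', \$out_bytes or die "open out: $!";

my $count = 0;
while (my $line = <$in>) {
    # "urn:example" is a placeholder namespace for illustration only.
    $count++ if $line =~ s/<s:Envelope/<s:Envelope xmlns:a="urn:example"/;
    print {$out} $line;
}
close $in;
close $out or die "close out: $!";

# Decode the result again so it can be inspected as ordinary text.
my $result = decode('UTF-16', $out_bytes);
print "raw match: $raw_matches, substitutions: $count\n";
```

On the command line, the equivalent would presumably be to set the layers before the loop starts, for example with BEGIN { binmode(STDIN, ":encoding(UTF-16)"); binmode(STDOUT, ":encoding(UTF-16)") } in front of the s/// expression. Note that Encode's "UTF-16" writes big-endian output with a BOM, which may differ from the byte order of the input file.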