[tpm] Manipulating utf8 strings with Perl

Vinny Alves vinny at usestrict.net
Tue May 8 13:38:03 PDT 2012


Remove the -Mutf8 and it'll work with a UTF-8 file. I had some issues
converting a test file to UTF-16.
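
For the UTF-16 case, here is a minimal sketch of one approach, not a tested fix for the original file. With -p the script body only runs after the first line has already been read, so a binmode() inside the body comes too late for that line. Putting the binmode() calls in a BEGIN block sets the layers before any input is read. The file names sample.utf16 and sample.out, the "urn:example" namespace value, and the choice of UTF-8 for the output are all placeholders assumed for illustration; the input is assumed to start with a BOM, which the plain "UTF-16" encoding needs when decoding.

```shell
# Build a small UTF-16 test file (BOM included) with Perl's core Encode module.
perl -MEncode -e 'print encode("UTF-16", "<s:Envelope>\n<s:Body>\n</s:Body>\n</s:Envelope>\n")' > sample.utf16

# Set the I/O layers in BEGIN so they are in place before -p reads the first line.
# "urn:example" is a hypothetical namespace value used only for this demo.
perl -pe 'BEGIN { binmode(STDIN,  ":encoding(UTF-16)");
                  binmode(STDOUT, ":encoding(UTF-8)") }
          s/<s:Body>/<s:Body xmlns:a="urn:example">/' < sample.utf16 > sample.out
```

If the output should stay UTF-16, changing the STDOUT layer to ":encoding(UTF-16)" should be the only change needed.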

Vinny
http://cronblocks.com


On Tue, May 8, 2012 at 4:27 PM, Antonio Sun <antoniosun at lavabit.com> wrote:

> Thanks everyone for your replies.
>
> Yeah, I realized afterward that the file is UTF-16.
>
> On Tue, May 8, 2012 at 3:27 PM, Vinny Alves <vinny at usestrict.net> wrote:
>
>> perl -e 'binmode(STDIN,":encoding(UTF-8)"); while(<>){ s/<s:Body>/<s:Body xmlns:a=...>/ }' < myfile.utf8
>>
>
> Thanks, that's exactly what I was looking for.
> However, I tried the following and it doesn't work.
>
> $ perl -Mutf8 -pe 'binmode(STDIN,":encoding(UTF-16)");
>  s/<s:Envelope/<s:Envelope xmlns:a=...>/' < myfile.utf8 > myfile.out
>
> $ cmp myfile.utf8 myfile.out && echo same
> same
>
> Seems that the replace strings ('s:Envelope...') are not handled as wide
> chars in Perl.
>
>
>