[Edinburgh-pm] Send UTF8 info across Socket connection

Alex Brelsfoard alex.brelsfoard at gmail.com
Thu Sep 9 08:19:36 PDT 2010


Thanks Murray and Marco.

Murray, I'd already checked out that very same link, but thanks.  It is a
handy one.
But I'd never used Devel::Peek before.  I'll definitely be playing with that
one.

Marco, thanks for the detailed explanation of things.  It really helped me
understand a few things in more depth.

You'll both love what it turned out to be...... I was double-encoding the
strings.
Since I receive data in many formats, I convert them all into UTF8, and then
send the data through a socket to another server.
Once at the other end I was trying to encode them again.... sadly this made
them look a LOT like they would had they never been cleaned up in the first
place.  Hence my thinking that the encoding-correction wasn't working.
I had forgotten that I had tried to clean up the encoding at both ends...
UGH....
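
For anyone who hits the same thing, here is a minimal sketch of what
double-encoding does to a string.  It is only an illustration, not the
actual code from either server: the variable names and the hex_bytes()
helper are made up, and the sample character is the one from the
perldoc utf8 example quoted below.

```perl
use strict;
use warnings;

# Hypothetical helper: show a string as space-separated hex values.
sub hex_bytes { join ' ', map { sprintf '%02X', ord } split //, $_[0] }

# One character, U+0100, as in the perldoc utf8 example.
my $once = "\x{100}";

# First encode (correct): the character becomes the UTF-8 bytes C4 80.
utf8::encode($once);

# Second encode (the mistake): each byte is treated as a Latin-1
# character and encoded again, so two bytes turn into four.
my $twice = $once;
utf8::encode($twice);

print "encoded once:  ", hex_bytes($once),  "\n";   # C4 80
print "encoded twice: ", hex_bytes($twice), "\n";   # C3 84 C2 80
```

The twice-encoded bytes are still valid UTF-8, so nothing fails loudly;
the text just comes out as mojibake on the receiving side.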

Many thanks for your help.
At least I DID still learn something.

--Alex


Marco Fontani <fontani at gmail.com> wrote:

>
> So send() expects bytes. If you're giving it characters, it will not
> do the right thing.
> To "get bytes from a string of utf8 characters", you can use
> utf8::encode($str) if you know $str is utf8.
>
> The other side will then receive bytes and if it needs characters it
> will have to call utf8::decode($received);
>
> From perldoc utf8:
> # utf8::encode($string);  # "\x{100}" becomes "\xc4\x80"
> #   that is, utf8 string to bytes
> # utf8::decode($string);  # "\xc4\x80" becomes "\x{100}"
> #   that is, bytes to utf8 string
>
> Try the above, and let me know how it goes ;)
>
> Unicode is easy*!!!
>
> Just my 2 cents,
> -marco-
>
> * to get wrong
>
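
A rough sketch of the encode-once / decode-once round trip described
above.  The assumptions are mine, not from the thread: a local
socketpair() stands in for the two servers (so this needs a platform
with AF_UNIX sockets), the sample text is invented, and the message is
small enough to arrive in a single recv().

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# A connected pair of local sockets stands in for the two servers.
socketpair(my $sender, my $receiver, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

# Made-up sample: a character string with two non-ASCII characters.
my $text = "caf\x{e9} \x{100}";

# Sending side: encode exactly once, on a copy, then send the bytes.
my $bytes = $text;
utf8::encode($bytes);
defined(send($sender, $bytes, 0)) or die "send failed: $!";

# Receiving side: read the bytes, then decode (never encode again).
my $received = '';
defined(recv($receiver, $received, 1024, 0)) or die "recv failed: $!";
utf8::decode($received) or die "received data was not valid UTF-8";

print $received eq $text ? "round trip ok\n" : "round trip MISMATCH\n";
```

Encoding on a copy keeps the original character string usable on the
sending side, and the receiver only ever calls utf8::decode().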