[Edinburgh-pm] Send UTF8 info across Socket connection

Alex Brelsfoard alex.brelsfoard at gmail.com
Thu Sep 9 08:19:36 PDT 2010

Thanks Murray and Marco.

Murray, I'd already checked out that very same link, but thanks.  It is a
handy one.
But I'd never used Devel::Peek before.  I'll definitely be playing with that.

Marco, thanks for the detailed explanation of things.  It really helped me
understand a few things in more depth.

You'll both love what it turned out to be... I was double-encoding the data.
Since I receive data in many formats, I convert it all to UTF-8 and then
send it through a socket to another server.
Once at the other end, I was trying to encode it again. Sadly, this made
the strings look a LOT like they would have had they never been cleaned up
in the first place, hence my thinking that the encoding correction wasn't
working.
I had forgotten that I had tried to clean up the encoding at both ends...
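
For anyone who hits the same thing later, here is a minimal sketch of what double-encoding does to a string. It uses only the utf8::encode and utf8::decode functions Marco mentions below; the sample string and the variable names are made up for illustration and are not from my actual code.

```perl
use strict;
use warnings;

# Hypothetical sample: a character string with one non-ASCII
# character (U+00E9, e-acute).
my $chars = "caf\x{e9}";

# Correct: encode once, characters -> UTF-8 bytes.
my $once = $chars;
utf8::encode($once);            # "caf\xc3\xa9"

# The mistake: encode the already-encoded bytes a second time.
# Each byte is treated as a character and encoded again.
my $twice = $once;
utf8::encode($twice);           # "caf\xc3\x83\xc2\xa9"

# A receiver that decodes once gets the original back from $once,
# but only gets the garbled intermediate form back from $twice.
my $good = $once;
utf8::decode($good);
my $bad = $twice;
utf8::decode($bad);

printf "encoded once:  %d bytes\n", length $once;
printf "encoded twice: %d bytes\n", length $twice;
print  "single encode round-trips\n"   if $good eq $chars;
print  "double encode does not\n"      if $bad  ne $chars;
```

The doubly encoded string decodes to "caf" followed by the two characters U+00C3 and U+00A9, which is the familiar mojibake that looks as though the data was never cleaned up.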

Many thanks for your help.
At least I DID still learn something.


Marco Fontani <fontani at gmail.com> wrote:

> So send() expects bytes. If you're giving it characters, it will not
> do the right thing.
> To "get bytes from a string of utf8 characters", you can use
> utf8::encode($str) if you know $str is utf8.
> The other side will then receive bytes and if it needs characters it
> will have to utf8::decode($received);
> From perldoc utf8:
> # utf8::encode($string); # "\x{100}" becomes "\xc4\x80"
> #                        # that is, utf8 string to bytes
> # utf8::decode($string); # "\xc4\x80" becomes "\x{100}"
> #                        # that is, bytes to utf8 string
> Try the above, and let me know how it goes ;)
> Unicode is easy*!!!
> Just my 2 cents,
> -marco-
> * to get wrong
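
The advice quoted above can be sketched roughly as follows: encode exactly once before send(), decode exactly once after recv(). This is only an illustration under some assumptions. A local socketpair() stands in for the two servers (so it assumes a platform with AF_UNIX socket pairs), the message and variable names are hypothetical, and a single recv() is assumed to return the whole short message, which a real stream protocol should not rely on.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);

# A connected pair of local sockets stands in for the two servers.
socketpair(my $sender, my $receiver, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair failed: $!";

# Hypothetical message: a character string, including U+263A.
my $message = "na\x{ef}ve \x{263a}";

# Sending side: turn characters into UTF-8 bytes exactly once.
my $payload = $message;
utf8::encode($payload);
defined send($sender, $payload, 0)
    or die "send failed: $!";

# Receiving side: read bytes, then decode exactly once.
# No second encode here; that was the double-encoding bug.
defined recv($receiver, my $received, 1024, 0)
    or die "recv failed: $!";
utf8::decode($received)
    or die "received data is not valid UTF-8";

print "round trip ok\n" if $received eq $message;
```

Copying $message into $payload before encoding keeps the original character string intact, since utf8::encode modifies its argument in place.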