[Wellington-pm] More on Unicode and DB

Srdjan srdjan at catalyst.net.nz
Mon Sep 17 21:37:12 PDT 2007

Hash: SHA1

Is that deficiency of DBI::Pg or postgres client library?

Grant McLean wrote:
> Hi Mongers
> In my talk about Unicode the other month, one subject I didn't get time
> to explore properly was working with databases.  I've now done a little
> more of work in this area with Postgres.
> The initial question was how to 'round-trip' a Unicode string.
> For example say I insert a row with a text field containing the string
> 'Māori' (or "M\x{101}ori" in ASCII Perl).  This string is 5 characters
> long, but 6 bytes long because the second character is a multibyte
> character.
> I found that if I subsequently queried that value what I got back was
> effectively a binary string.  It's the same 6 bytes I inserted but
> without the UTF8 flag set on the Perl scalar, so it appears to be a 6
> character string.
> I could use Encode::decode to translate that into a scalar with the UTF8
> flag set but that's a bit tedious.
> It turns out that the Postgres driver (DBD::Pg) has a flag which can be
> set to do the decoding automatically, e.g.:
>   my $attr = {
>       RaiseError     => 1,
>       AutoCommit     => 0,
>       pg_enable_utf8 => 1,
>   };
>   my $dbh = DBI->connect($dsn, $user, $passwd, $attr);
> With the pg_enable_utf8 flag set, Unicode strings are round-tripped
> correctly automatically.
> It's mildly concerning that this flag is labelled "experimental and may
> be subject to change" but I'm sure if it was changed it would only be
> for the better :-)
> Cheers
> Grant
> _______________________________________________
> Wellington-pm mailing list
> Wellington-pm at pm.org
> http://mail.pm.org/mailman/listinfo/wellington-pm
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org


More information about the Wellington-pm mailing list