[tpm] PostgreSQL INSERT/UTF-8 problem
Madison Kelly
linux at alteeve.com
Wed Jul 9 04:52:02 PDT 2008
Thanks for the reply, Rob!
I will play with Encode as soon as I get into the office. As for the
source encoding, let me follow that up in my reply to Cees.
Madi
Rob Janes wrote:
> oops, first reply went to Madison alone ...
>
> I think this is better ...
>
> use Encode qw(from_to decode);
>
> my $data = "Résidence";
> ## assuming Résidence is encoded in 8859-1
> from_to($data, "iso-8859-1", "utf8");
> or
> my $data = decode("iso-8859-1", "Résidence");
>
> both of these will produce UTF-8 text from Résidence (from_to rewrites
> the bytes in place; decode returns a Perl character string). However,
> depending on the original encoding of Résidence, what's stored in the
> database may or may not be what you want.
>
> In other words, the lack of an error message is not indicative of it
> working.
>
> -rob
>
> On Tue, Jul 8, 2008 at 4:41 PM, Rob Janes <janes.rob at gmail.com> wrote:
>
> methinks your perl script is encoded in iso-8859-1, or a windows
> code page. just cause you can see the accent doesn't mean it's
> right. set your editor to utf-8. or use character conversions.
>
> use utf8; ## not sure about this, is pragma
> my $blob = 'Résidence';
> utf8::encode($blob); ## encodes in place, returns nothing
>
> or
> use Encode;
> $blob = encode("utf8", 'Résidence' );
>
> Encode doesn't make any statements about the encoding of your
> script, it might be the better way.
>
> iconv is another possibility.
>
> look at the man page for charnames.
>
> use charnames ":full";
> print "R\N{LATIN SMALL LETTER E WITH ACUTE}sidence\n";
>
> -rob
>
>
> On Tue, Jul 8, 2008 at 3:29 PM, Madison Kelly <linux at alteeve.com> wrote:
>
> Hi all, second question of the day!
>
> I've got a problem INSERTing a value into my DB. It's a French
> character 'é', and my DB is set to UTF8, but the error is:
>
> INSERT INTO customer_data (cd_cust_id, cd_variable, cd_value,
> added_user, added_date, modified_user, modified_date) VALUES (1,
> 'CustServiceTypeDisplay_F', 'Résidence', 1, now(), 1, now());
>
> DBD::Pg::db do failed: ERROR: invalid byte sequence for
> encoding "UTF8": 0xe97369
> HINT: This error can also happen if the byte sequence does not
> match the encoding expected by the server, which is controlled
> by "client_encoding".
>
> When I manually run the INSERT, it works, so I know the problem
> is in perl somewhere. Now then, I set up my script with this:
>
> # Setup for UTF-8 mode.
> binmode STDOUT, ":utf8:";
> $ENV{'PERL_UNICODE'}=1;
>
> When I create my PgSQL connection, I use:
>
> $dbh=DBI->connect($db_connect_string, $$conf{db}{user},
> $$conf{db}{pass},
> {
> RaiseError => 1,
> AutoCommit => 1,
> pg_enable_utf8 => 1
> }
> ) or die ...;
>
> I push a pile of queries into an array (referenced) and run
> them like this:
>
> # Sanity checks stripped for the email
> $dbh->begin_work;
> foreach my $query (@{$sql})
> {
> print "Query: [$query]\n";
> $dbh->do($query) or $error.=$DBI::errstr.", ";
> }
> $dbh->commit;
>
> Lastly, my database itself is set to UTF8:
>
> SET client_encoding = 'UTF8';
>
> I've tried knocking out the 'pg_enable_utf8 => 1' line in case
> I was dealing with double-encoding, but that didn't help.
>
> Any tips/ideas?
>
> Thanks!
>
> Madi
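
A small sketch that may help when reading the error quoted above. It takes the three bytes from the message (0xe97369) and shows that they are not valid UTF-8 but do decode cleanly as ISO-8859-1, which fits Rob's guess about the script's encoding. The bytes are written as \x escapes so the result does not depend on how this file itself is saved. The variable names are only illustrative, and ISO-8859-1 is an assumption: a Windows code page such as CP1252 would give the same result for these particular bytes.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The three bytes PostgreSQL rejected, copied from the error message.
my $bad = "\xe9\x73\x69";

# Strict UTF-8 decoding fails on them; FB_CROAK makes decode() die on
# malformed input. decode() may modify its source when a check value is
# given, so work on a copy.
my $copy    = $bad;
my $as_utf8 = eval { decode("UTF-8", $copy, Encode::FB_CROAK) };
print "valid UTF-8? ", (defined $as_utf8 ? "yes" : "no"), "\n";
# prints: valid UTF-8? no

# Read as ISO-8859-1 (assumed source encoding) they are e-acute, "s", "i".
my $chars = decode("iso-8859-1", $bad);
printf "code points: %s\n",
    join " ", map { sprintf("U+%04X", ord($_)) } split //, $chars;
# prints: code points: U+00E9 U+0073 U+0069

# Re-encoded as UTF-8, e-acute becomes the two bytes c3 a9.
my $utf8_bytes = encode("UTF-8", $chars);
printf "UTF-8 bytes: %s\n",
    join " ", map { sprintf("%02x", ord($_)) } split //, $utf8_bytes;
# prints: UTF-8 bytes: c3 a9 73 69
```

If the first line reports "no" for the data actually being sent, the string is most likely still in a single-byte encoding when it reaches DBD::Pg, whatever the terminal displays.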