[tpm] PostgreSQL INSERT/UTF-8 problem

Rob Janes janes.rob at gmail.com
Tue Jul 8 14:51:17 PDT 2008


oops, first reply went to Madison alone ...

I think this is better ...

use Encode qw(from_to decode);

my $data = "Résidence";
## assuming Résidence is encoded in 8859-1; converts $data in place to utf8 bytes
from_to($data, "iso-8859-1", "utf8");

or

## returns a perl character string, not utf8 bytes
my $data = decode("iso-8859-1", "Résidence");

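both of these convert Résidence from 8859-1, but not to the same thing.
from_to rewrites $data in place as utf8 bytes; decode returns a perl
character string, which still has to be encoded on the way out (by you or by
the driver).  Either way, if Résidence was not really in 8859-1 to begin with,
what's stored in the database may or may not be what you want.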
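
A minimal sketch of that difference, assuming the input really is 8859-1.
The string is spelled with an explicit \xE9 escape so the result does not
depend on how the script file is saved; the variable names are only for
illustration.

```perl
use strict;
use warnings;
use Encode qw(from_to decode encode);

# "R\xE9sidence" spells the iso-8859-1 bytes explicitly.
my $latin1 = "R\xE9sidence";

# from_to converts in place and leaves utf8 *bytes* in $bytes.
my $bytes = $latin1;
from_to($bytes, "iso-8859-1", "utf8");

# decode returns a character string; encode turns it into utf8 bytes.
my $chars   = decode("iso-8859-1", $latin1);
my $encoded = encode("UTF-8", $chars);

printf "from_to: %d bytes, decode: %d characters\n",
    length($bytes), length($chars);
print "same bytes after encode\n" if $bytes eq $encoded;
```

The byte string is one longer than the character string because é takes two
bytes in utf8.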

In other words, the lack of an error message is not indicative of it
working.
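
To see why, here is a rough sketch that dumps the bytes in both cases.
hex_bytes is a made-up helper, and the two inputs are just é written as
explicit byte escapes.

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Hypothetical helper: hex dump of the bytes in a string.
sub hex_bytes { return unpack("H*", $_[0]); }

# Case 1: the input really is iso-8859-1 (é is the single byte 0xe9).
my $from_latin1 = "\xE9";
from_to($from_latin1, "iso-8859-1", "utf8");

# Case 2: the input was already utf8 (é is 0xc3 0xa9), so converting it
# again double-encodes it. No error is raised.
my $from_utf8 = "\xC3\xA9";
from_to($from_utf8, "iso-8859-1", "utf8");

print hex_bytes($from_latin1), "\n";  # c3a9
print hex_bytes($from_utf8), "\n";    # c383c2a9
```

Both results are valid utf8, so the database would accept either one; only
the first is actually é.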

-rob

On Tue, Jul 8, 2008 at 4:41 PM, Rob Janes <janes.rob at gmail.com> wrote:

> methinks your perl script is encoded in iso-8859-1, or a windows code
> page: the 0xe9 in the error is é in those encodings, and 0x73 0x69 are the
> "si" after it.  just cause you can see the accent doesn't mean it's right.
> set your editor to utf-8.  or use character conversions.
>
> use utf8;  ## pragma: says the script source is utf-8, so save it that way
> my $blob = 'Résidence';
> utf8::encode($blob);  ## converts in place, returns nothing
>
> or
> use Encode;
> $blob = encode("utf8", 'Résidence' );
>
> Encode doesn't make any statements about the encoding of your script, it
> might be the better way.
>
> iconv is another possibility.
>
> look at the man page for charnames.
>
> use charnames ":full";
> print "R\N{LATIN SMALL LETTER E WITH ACUTE}sidence\n";
>
> -rob
>
>
> On Tue, Jul 8, 2008 at 3:29 PM, Madison Kelly <linux at alteeve.com> wrote:
>
>> Hi all, second question of the day!
>>
>>  I've got a problem INSERTing a value into my DB. It's a French character
>> 'é', and my DB is set to UTF8, but the error is:
>>
>> INSERT INTO customer_data (cd_cust_id, cd_variable, cd_value, added_user,
>> added_date, modified_user, modified_date) VALUES (1,
>> 'CustServiceTypeDisplay_F', 'Résidence', 1, now(), 1, now());
>>
>> DBD::Pg::db do failed: ERROR:  invalid byte sequence for encoding "UTF8":
>> 0xe97369
>> HINT:  This error can also happen if the byte sequence does not match the
>> encoding expected by the server, which is controlled by "client_encoding".
>>
>>  When I manually run the INSERT, it works, so I know the problem is in
>> perl somewhere. Now then, I setup my script with this:
>>
>> # Setup for UTF-8 mode.
>> binmode STDOUT, ":utf8:";
>> $ENV{'PERL_UNICODE'}=1;
>>
>>  When I create my PgSQL connection, I use:
>>
>> $dbh=DBI->connect($db_connect_string, $$conf{db}{user}, $$conf{db}{pass},
>> {
>>        RaiseError => 1,
>>        AutoCommit => 1,
>>        pg_enable_utf8 => 1
>> }
>> ) or die ...;
>>
>>  I push a pile of queries into an array (referenced) and run them like
>> this:
>>
>> # Sanity checks stripped for the email
>> $dbh->begin_work;
>> foreach my $query (@{$sql})
>> {
>>        print "Query: [$query]\n";
>>        $dbh->do($query) or $error.=$DBI::errstr.", ";
>> }
>> $dbh->commit;
>>
>>  Lastly, my database itself is set to UTF8:
>>
>> SET client_encoding = 'UTF8';
>>
>>  I've tried knocking out the 'pg_enable_utf8 => 1' line in case I was
>> dealing with double-encoding, but that didn't help.
>>
>>  Any tips/ideas?
>>
>> Thanks!
>>
>> Madi