the code won't fail, but you would do an UPDATE more often than you need to.<br><br>the key-value from the database is utf8, "foo=bar"<br>the key-value from the http form is, say, ebcdic, "foo=bar"<br>
<br>the raw bytes of utf8 "bar" do not equal the raw bytes of ebcdic "bar", so the comparison says the value has changed.<br><br>you then update the database with decode("ebcdic", $input_form{foo}), which writes exactly the same string as what's already there.<br>
<br>to save on database updates, one would decode before comparing:<br>
<br>use Encode qw(decode);<br><br>my $charset_from_http_header;<br># ... set $charset_from_http_header ...<br><br>while (my ($key, $value) = each %new_input_from_http_form)<br>
{<br> my $decoded_key = decode($charset_from_http_header, $key);<br> my $decoded_value = decode($charset_from_http_header, $value); ## or $new_input_from_http_form{$key}<br><br>
if ( $decoded_value ne $old_input_saved_in_db{$decoded_key} ) ## important that $decoded_key is used here<br>
{<br>
# Update the DB using the '$decoded_value' value.<br>
}<br>
}<br><br>well ... the http form keys are probably ascii, since you designed the form, and therefore $decoded_key is the same as $key. the form's keys and values will both be encoded as per the http header charset. if the browser is set to an ebcdic code page, you'll be in trouble if you don't decode the key. that would be really weird though, having a browser doing ebcdic. however, i wouldn't be surprised if there were other code pages that are not ascii friendly, like maybe mandarin, thai, tibetan, whatever.<br>
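<br>as a minimal sketch of why the decode has to happen before the comparison: the two byte strings below are hypothetical stand-ins for a form value sent as iso-8859-1 and the same value stored as utf8. the variable names are made up for illustration, and iso-8859-1 stands in for whatever charset the http header actually reports.<br>

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical raw octets for the same word in two encodings.
my $from_form = "R\xE9sidence";      # "Résidence" as ISO-8859-1 octets
my $from_db   = "R\xC3\xA9sidence";  # "Résidence" as UTF-8 octets

# Comparing the raw octets: they differ, so this would trigger an UPDATE.
my $raw_equal = ($from_form eq $from_db) ? 1 : 0;

# Comparing after decoding each side from its own encoding: they match.
my $decoded_equal =
    (decode("iso-8859-1", $from_form) eq decode("UTF-8", $from_db)) ? 1 : 0;

print "raw equal: $raw_equal\n";          # prints "raw equal: 0"
print "decoded equal: $decoded_equal\n";  # prints "decoded equal: 1"
```

the raw comparison reports a difference, so a loop that compares undecoded values would issue an UPDATE even though nothing changed; comparing the decoded strings avoids that.<br>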
<br>the binary representation of an ascii string is the same in latin1 as it is in any of the iso-8859 dialects, and in utf8. but if you render that same string in ebcdic, kansas goes bye bye.<br><br>-rob<br><br><div class="gmail_quote">
On Wed, Jul 9, 2008 at 12:50 PM, Madison Kelly <<a href="mailto:linux@alteeve.com">linux@alteeve.com</a>> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="Ih2E3d">Rob Janes wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
oops, first reply went to Madison alone ...<br>
<br>
I think this is better ...<br>
<br>
use Encode qw(from_to decode);<br>
<br>
my $data = "Résidence";<br>
from_to($data, "iso-8859-1", "utf8"); ## assuming Résidence is encoded in 8859-1<br>
or<br>
my $data = decode("iso-8859-1", "Résidence");<br>
<br>
both of these will turn Résidence into utf8: from_to converts $data in place into utf8 bytes, while decode returns a perl character string. However, depending on the original encoding of Résidence, what's stored in the database may or may not be what you want.<br>
<br>
In other words, the lack of an error message is not indicative of it working.<br>
<br>
-rob<br>
</blockquote>
<br></div>
I am not sure if this is the most ... appropriate way to solve the problem, so I would still love to have some feedback if you (or anyone) have any.<br>
<br>
I got it working this way:<br>
<br>
- Read the data from the website and push it into a hash (hidden input values stored as "$input{name}=value;").<br>
- Loop through the '%input' hash keys and populate a new hash '%enc_input' with the same format.<br>
- Read the old values from the database into a matching hash called '%old_input'.<br>
- Pseudo-code:<br>
foreach $key (keys %input)<br>
{<br>
if ( $input{$key} ne $old_input{$key} )<br>
{<br>
# Update the DB using the '$enc_input' hash value.<br>
}<br>
}<br>
<br>
It's ugly as sin, but it seems like the only time I need to use 'Encode' functions is in the actual PgSQL INSERT or UPDATE calls; not in the comparison of the value either from the HTML page or from the database.<br>
<br>
Odd.<br>
<br>
Thanks for your help, Rob and Cees!<br>
<br>
Madi<br>
</blockquote></div><br>