[San-Diego-pm] Cold shower in UTF-8
Joel Fentin
joel at fentin.com
Sat Oct 26 06:02:40 PDT 2013
On 10/25/2013 7:19 PM, Brian Manning wrote:
> On Fri, Oct 25, 2013 at 4:22 PM, Joel Fentin <joel at fentin.com> wrote:
>> I did some web sites long ago. Their owner moved them to Network Solutions.
>> Network Solutions suddenly and without prior notice changed the MySQL
>> character encoding to UTF-8. There are fields in the database which are
>> displayed on webpages. I have some cleanup to do.
>>
>> Is there an industry standard for putting CR &/or LF into such a database
>> text field? Or does everyone roll his own?
>
> An SQL UPDATE using the output of a SELECT * from your existing tables
> should work, I should think.
>
> You may also be able to drop then recreate the tables using the same
> encoding you used before. That would be up to NetSol.
>
>> Are there industry standards for áéíñóúÁÉÍÑÓÚ¡¿
>
> Yes, they're called ISO standards and/or Unicode standards, depending
> on what the encoding of your existing text is. You could use 'iconv'
> or 'enca/enconv' to detect your source encoding and/or convert it
> to UTF-8. You could also use *cough*PERL*cough*, but it's
> probably easier and quicker to use existing tools built for this
> purpose than to roll your own in *cough*PERL*cough*.
>
> Thanks,
>
> Brian
Either you don't understand my problem or I don't understand you
or both. But I appreciate your and Russ's efforts.
Before the MySQL conversion, the operator would type the following
into a text area:
line1 + [enter key] + line2 + [enter key] + line3
When they were done, they would click an OK button.
I ran what they typed through the following code before putting it
into the database:
$Value =~ s/\15//g; # strip chr 13 (CR); it may screw up the db file
$Value =~ s/\n/¶/g; # convert chr 10 (LF) to ¶
I arbitrarily chose ¶ to represent LF.
To later access this for display on a webpage, I took what was in
the database and ran it through this:
$Value =~ s/¶/<br>/g;
The displayed result looked like this:
line1
line2
line3
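
For reference, the store/display round trip described above can be
sketched as a pair of small subs. This is only a sketch: the sub names
and the $PILCROW variable are made up for illustration, and the
placeholder is written as an escape (\x{B6}) so the example does not
depend on how the script file itself is encoded.

```perl
use strict;
use warnings;

# The placeholder character (¶, code point 182), written as an escape.
my $PILCROW = "\x{B6}";

# Hypothetical helper: prepare textarea input for storage.
sub encode_newlines {
    my ($value) = @_;
    $value =~ s/\r//g;            # drop CR (chr 13)
    $value =~ s/\n/$PILCROW/g;    # replace LF (chr 10) with the placeholder
    return $value;
}

# Hypothetical helper: turn the stored form into HTML for display.
sub decode_newlines {
    my ($value) = @_;
    $value =~ s/$PILCROW/<br>/g;
    return $value;
}

my $typed  = "line1\r\nline2\r\nline3";   # what the browser typically sends
my $stored = encode_newlines($typed);
my $html   = decode_newlines($stored);
print "$html\n";                          # line1<br>line2<br>line3
```
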
======================
If I attempt this now, I can do the same thing, but would have to
replace the display code (above) with:
$Value =~ s/¶/<br \/>/g;
This is because ¶ is greater than chr 127, so UTF-8 stores it as two
bytes (0xC2 0xB6) instead of one.
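
A minimal sketch of what is probably going on, assuming the database
now hands the field back as raw UTF-8 bytes that the script never
decodes (that is an assumption, not something confirmed for this
setup). Matching the one-character ¶ against those bytes replaces only
the second byte and leaves a stray 0xC2 behind; decoding first with
the core Encode module avoids that. The variable names are made up for
illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $PILCROW = "\x{B6}";    # ¶ as one character (code point 182)

# Simulate the raw bytes a UTF-8 column would return for "line1¶line2".
my $bytes = encode('UTF-8', "line1${PILCROW}line2");
print length($bytes), "\n";    # 12, because ¶ now occupies two bytes

# Matching against undecoded bytes: only the 0xB6 byte is replaced,
# so the leading 0xC2 byte is left in the output.
(my $broken = $bytes) =~ s/$PILCROW/<br>/g;

# Decoding to characters first makes the original pattern work again.
my $text = decode('UTF-8', $bytes);
(my $html = $text) =~ s/$PILCROW/<br>/g;
print "$html\n";               # line1<br>line2
```
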
======================
I'd rather go with a standard than roll my own. I confess,
when I go to http://en.wikipedia.org/wiki/UTF-8
I don't quite grasp either the Description or the codepage layout.
They give an example of €. I can't follow it. Worse, I don't know how
much I need to know and how much I don't.
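
As a worked version of the € example: characters below 128 are stored
as a single unchanged byte, 128 through 2047 take two bytes, and 2048
through 65535 take three. The sub below is a hypothetical illustration
of that bit-shuffling only; it ignores code points above U+FFFF and
does no validation. Real code should use the Encode module instead.

```perl
use strict;
use warnings;

# Hypothetical helper: UTF-8 bytes for one code point up to U+FFFF.
sub utf8_bytes {
    my ($cp) = @_;
    return ($cp) if $cp < 0x80;                 # 0xxxxxxx
    return (0xC0 | ($cp >> 6),                  # 110xxxxx: top bits
            0x80 | ($cp & 0x3F))                # 10xxxxxx: low 6 bits
        if $cp < 0x800;
    return (0xE0 | ($cp >> 12),                 # 1110xxxx: top 4 bits
            0x80 | (($cp >> 6) & 0x3F),         # 10xxxxxx: middle 6 bits
            0x80 | ($cp & 0x3F));               # 10xxxxxx: low 6 bits
}

# Format the bytes as space-separated hex.
sub hex_bytes {
    return join ' ', map { sprintf '%02X', $_ } @_;
}

my $euro    = hex_bytes(utf8_bytes(0x20AC));    # € is U+20AC
my $pilcrow = hex_bytes(utf8_bytes(0xB6));      # ¶ is U+00B6
my $newline = hex_bytes(utf8_bytes(0x0A));      # LF is unchanged
print "$euro\n";       # E2 82 AC
print "$pilcrow\n";    # C2 B6
print "$newline\n";    # 0A
```
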
--
Joel Fentin tel: 760-749-8863
Biz Website: http://fentin.com
Personal Website: http://fentin.com/me