SPUG: International characters from input form
Kevin Fink
kevin-spug at fink.com
Tue Jan 16 16:23:21 PST 2007
HTTP_ACCEPT_LANGUAGE is the CGI variable for the Accept-Language request
header, so it indicates which languages the browser would like to receive
from the server, not which language the user typed into the form. The q
parameter is a weight from 0 to 1 (default 1) that lets the server decide
which language to send when it can produce several acceptable ones.
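As a rough illustration of how that weighting reads, here is a minimal Python sketch that splits an Accept-Language value into (language, q) pairs and sorts them by preference. The function name parse_accept_language is made up for this example, and the parsing is deliberately simplified (it ignores wildcards and does no tag validation).

```python
def parse_accept_language(header):
    """Return [(language_tag, q), ...] sorted from most to least preferred.

    Simplified sketch: a missing q defaults to 1.0, an unparsable q
    is treated as 0.0, and ties keep their original order.
    """
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        # "ja;q=0.5" -> tag "ja", params ["q=0.5"]
        tag, *params = [part.strip() for part in item.split(";")]
        q = 1.0
        for param in params:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((tag.lower(), q))
    # sorted() is stable, so equal weights stay in header order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


print(parse_accept_language("en-us,ja;q=0.5"))
# [('en-us', 1.0), ('ja', 0.5)]
```

The result depends only on the browser's configured preferences, which is why it does not change when the form text changes.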
I don't know the answer to the real question, though. There is an
interesting paper that discusses it at:
http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html
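To show why the raw bytes alone don't settle it, here is a small Python sketch (not the method from the paper) that percent-decodes the QUERY_STRING from the quoted message and tries a few candidate encodings. The candidate list and the name decodable_encodings are my own assumptions for illustration; a real detector would also weigh character frequencies rather than only check whether decoding succeeds.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical candidate list, chosen for illustration only.
CANDIDATES = ["utf-8", "euc_jp", "shift_jis", "gb2312", "iso-8859-1"]


def decodable_encodings(raw, candidates=CANDIDATES):
    """Return {encoding: decoded_text} for every candidate that decodes raw."""
    results = {}
    for encoding in candidates:
        try:
            results[encoding] = raw.decode(encoding)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; rule it out.
            pass
    return results


query_string = "Name1=%B6%C1%C4%C1&Name2=%BD%C1%BD%B2"
for pair in query_string.split("&"):
    name, _, value = pair.partition("=")
    raw = unquote_to_bytes(value)  # the bytes the browser actually sent
    print(name, raw.hex(), sorted(decodable_encodings(raw)))
```

UTF-8 is ruled out because the bytes are not valid UTF-8, but more than one of the other encodings accepts the same bytes, and ISO-8859-1 accepts any byte sequence at all. Successful decoding therefore narrows the choice without identifying the language.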
Kevin
Gary Hawkins wrote:
> There's this webpage form where the user supplies some text, might be English,
> Swedish, Chinese, Portuguese, Russian, Hebrew, Arabic ...
>
> To Perl, is it (or can it be made to be) clearcut which language the user is
> sending to my program, truly without question, or are there any opportunities
> for confusion or possible crossover in Unicode-land?
>
> I do not want the user to have to tell me which language they are using, I want
> that to be determined programmatically, and hope to hear that someone has
> sorted all of that out already (Larry Wall and company or the folks at Apache)
> with no grey areas.
>
> On this:
>
> SERVER_SOFTWARE = Apache/1.3.34 (Unix) mod_layout/3.2
>
> ... I tried printing back %ENV with normal English text and received this:
>
> HTTP_ACCEPT_LANGUAGE = en-us,ja;q=0.5
>
> ... then tried inputting Japanese text instead and to my dismay saw the same
> thing:
>
> HTTP_ACCEPT_LANGUAGE = en-us,ja;q=0.5
>
> I would have been real happy to see this instead:
>
> HTTP_ACCEPT_LANGUAGE = ja,en-us;q=0.5
>
> Maybe there is a way I can tell from the following that they (I) used Japanese?
> I see it does appear to correctly reflect my 4 keyboard strokes for each field,
> but how am I to know it isn't Swahili?
>
> QUERY_STRING = Name1=%B6%C1%C4%C1&Name2=%BD%C1%BD%B2
>
> Does "en-us,ja" indicate the two languages installed on the user's system? If
> so, that would make sense and provide a clue (I have both English and Japanese
> keyboard inputs set up). What is 'q'? Now, if they happen to have Thai and
> Vietnamese, how am I to know which one they are using? Maybe Thai is %AF thru
> %D7 and Vietnamese is %D8 thru %FF or some such thing?
>
> Thanks,
>
> Gary Hawkins