SPUG: International characters from input form

Kevin Fink kevin-spug at fink.com
Tue Jan 16 16:23:21 PST 2007


HTTP_ACCEPT_LANGUAGE carries the browser's Accept-Language request
header, so it indicates what languages the browser would like to
receive from the server, not what language the user typed. The q
parameter is a weighting (0 to 1, defaulting to 1 when omitted) so the
server can decide which language to send when it can produce several
acceptable ones.
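
As a rough illustration of how the q weights order the list, here is a
minimal sketch in Python (the same logic is a few lines of Perl). The
function name parse_accept_language is made up for this example, and
it ignores wildcards and other corner cases of the header grammar.

```python
def parse_accept_language(header):
    """Return (language, q) pairs, most preferred first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        lang, _, params = item.partition(";")
        q = 1.0  # q defaults to 1 when the header omits it
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # assumption: treat a malformed weight as unwanted
        prefs.append((lang.strip(), q))
    # sorted() is stable, so languages with equal weight keep header order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


print(parse_accept_language("en-us,ja;q=0.5"))
```

The result depends only on the browser's configured preferences, so it
is the same whatever text is typed into the form.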

I don't know the answer to the real question, though. There is an 
interesting paper on character set detection at:

http://www.mozilla.org/projects/intl/UniversalCharsetDetection.html

that talks about the problem.
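
To see why guessing is hard, here is a small sketch in Python (easily
ported to Perl). The helper plausible_decodings and the short
candidate list are assumptions for illustration only. It
percent-decodes the first field of the QUERY_STRING quoted below and
tries each encoding in turn.

```python
from urllib.parse import unquote_to_bytes

# Assumption: a tiny, hand-picked candidate list; a real detector covers many more.
CANDIDATES = ["utf-8", "shift_jis", "euc_jp"]


def plausible_decodings(percent_encoded, candidates=CANDIDATES):
    """Map each candidate encoding that decodes without error to its text."""
    raw = unquote_to_bytes(percent_encoded)
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            pass  # these bytes are not valid in this encoding
    return results


decoded = plausible_decodings("%B6%C1%C4%C1")
for name, text in decoded.items():
    # Print code points rather than glyphs to sidestep terminal encoding issues
    print(name, len(text), [hex(ord(ch)) for ch in text])
```

More than one candidate can decode the same bytes cleanly, which is
the kind of ambiguity a detector has to resolve. A clean decode alone
does not say which encoding, let alone which language, was meant.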

Kevin

Gary Hawkins wrote:
> There's this webpage form where the user supplies some text, might be English,
> Swedish, Chinese, Portuguese, Russian, Hebrew, Arabic ...
> 
> To Perl, is it (or can it be made to be) clearcut which language the user is
> sending to my program, truly without question, or are there any opportunities
> for confusion or possible crossover in Unicode-land?
> 
> I do not want the user to have to tell me which language they are using, I want
> that to be determined programmatically, and hope to hear that someone has
> sorted all of that out already (Larry Wall and company or the folks at Apache)
> with no grey areas.
> 
> On this:
> 
> SERVER_SOFTWARE = Apache/1.3.34 (Unix) mod_layout/3.2
> 
> ... I tried printing back %ENV with normal English text and received this:
> 
> HTTP_ACCEPT_LANGUAGE = en-us,ja;q=0.5
> 
> ... then tried inputting Japanese text instead and to my dismay saw the same
> thing:
> 
> HTTP_ACCEPT_LANGUAGE = en-us,ja;q=0.5
> 
> I would have been real happy to see this instead:
> 
> HTTP_ACCEPT_LANGUAGE = ja,en-us;q=0.5
> 
> Maybe there is a way I can tell from the following that they (I) used Japanese?
> I see it does appear to correctly reflect my 4 keyboard strokes for each field,
> but how am I to know it isn't Swahili?
> 
> QUERY_STRING = Name1=%B6%C1%C4%C1&Name2=%BD%C1%BD%B2
> 
> Does "en-us,ja" indicate the two languages installed on the user's system?  If
> so, that would make sense and provide a clue (I have both English and Japanese
> keyboard inputs set up).  What is 'q'?  Now, if they happen to have Thai and
> Vietnamese, how am I to know which one they are using?  Maybe Thai is %AF thru
> %D7 and Vietnamese is %D8 thru %FF or some such thing?
> 
> Thanks,
> 
> Gary Hawkins
> 
> 
> 
> _____________________________________________________________
> Seattle Perl Users Group Mailing List  
>      POST TO: spug-list at pm.org
> SUBSCRIPTION: http://mail.pm.org/mailman/listinfo/spug-list
>     MEETINGS: 3rd Tuesdays
>     WEB PAGE: http://seattleperl.org/

