[Wellington-pm] More on Unicode and DB

Sam Vilain sam at vilain.net
Thu Sep 20 16:04:30 PDT 2007


Michael Robinson wrote:
> So, if you're trying to canonicalize text input, say in a search engine
> dealing with Maaori source documents, and you need to deal with the fact
> that some people input macrons, and some don't, then you probably also
> need to consult Unicode::Normalize.
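
For what it's worth, here is a rough sketch of that kind of macron folding. It is in Python rather than Perl, with unicodedata.normalize standing in for the NFD/NFC functions of Unicode::Normalize. The names fold_macrons and search_key are made up for illustration, and the sketch assumes that dropping U+0304 COMBINING MACRON is all the folding a search key needs. It does not map double-vowel spellings such as "Maaori" onto the macron form.

```python
import unicodedata

COMBINING_MACRON = "\u0304"

def fold_macrons(text: str) -> str:
    """Return text with macrons removed (hypothetical helper)."""
    # NFD splits a precomposed letter such as U+0101 into 'a' + U+0304.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop only the combining macron; other combining marks are kept.
    stripped = "".join(ch for ch in decomposed if ch != COMBINING_MACRON)
    # Recompose whatever is left so the key is in a single normal form.
    return unicodedata.normalize("NFC", stripped)

def search_key(text: str) -> str:
    """Macron-insensitive, case-insensitive key for indexing and queries."""
    return fold_macrons(text).casefold()
```

Indexing documents and queries under search_key() should let "M\u0101ori" and "Maori" match each other, whichever way the text was typed.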

And then there are the ones that the Unicode Consortium is in denial about.

Like, ॐ vs ૐ, ઍ vs અૅ, ख़ vs ख़ etc.  Apparently fontmakers are supposed to
distinguish these combinations (much like ł must not look like t)
even though they appear identical in the Unicode code charts.
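
A quick way to see which of these normalization can help with is to compare each pair under NFC and NFKC. This is only a sketch, again in Python with unicodedata.normalize in place of Unicode::Normalize. The code points in PAIRS are my assumption about what the characters above are, since the rendered glyphs do not say how they were encoded, and the names PAIRS, equivalent and report are made up for illustration.

```python
import unicodedata

# Assumed code points for the look-alike pairs (not confirmed by the text).
PAIRS = {
    "om": ("\u0950", "\u0AD0"),              # DEVANAGARI OM vs GUJARATI OM
    "candra e": ("\u0A8D", "\u0A85\u0AC5"),  # GUJARATI VOWEL CANDRA E vs A + VOWEL SIGN CANDRA E
    "khha": ("\u0959", "\u0916\u093C"),      # DEVANAGARI KHHA vs KHA + NUKTA
}

def equivalent(a: str, b: str, form: str = "NFC") -> bool:
    """True if a and b are identical after the given normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

def report(pairs=PAIRS):
    """Map each pair name to its NFC and NFKC comparison results."""
    return {
        name: {form: equivalent(a, b, form) for form in ("NFC", "NFKC")}
        for name, (a, b) in pairs.items()
    }

if __name__ == "__main__":
    for name, result in report().items():
        print(name, result)
```

Under those assumptions the two OM signs and the two CANDRA E spellings stay distinct in both forms, so a search engine would need its own folding table for them. U+0959 does have a canonical decomposition to U+0916 U+093C, so if that is what the third pair is, normalization already unifies it.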

Sam.

