[San-Diego-pm] accents
Tkil
tkil-sdpm at scrye.com
Thu Oct 28 14:18:26 CDT 2004
>>>>> "Tkil" == Tkil <tkil at scrye.com> writes:
Tkil> For more generic cases, what you want to find is something that
Tkil> will "canonicalize" the unicode into one of two base forms (but
Tkil> preferably "D" form, which uses combining marks whenever
Tkil> possible). Fortunately, there is a standard Unicode::Normalize
Tkil> module to do this for you. First, I have to justify
Tkil> it:...............
>>>>> "Joel" == Joel Fentin <joel at fentin.com> writes:
Joel> From that point in your email onward, I stopped understanding.
I suspect that you actually stopped reading, or stopped trying to
understand.
I was trying to explain *why* it was a problem in the first place.
The fact that your mail arrived butchered was a great example of why
it's a problem. But if you don't understand encodings, then you're
going to lose.
Joel> One thing seemed clear is that it didn't reek of fell-swoop. I
Joel> didn't see anything cookbook-ish that I could build upon. Thank
Joel> you anyhow.
I was trying to explain the problem; you wanted an instant solution,
which is not what I was providing. Put another way: I was trying to
teach you how to fish. You were looking for a fish handout.
Joel> My goal is similar to that of a search engine. Take a word or a
Joel> phrase and check it against a longer hunk of text. Yes there is
Joel> a match or no there isn't. The i modifier to m// takes care of
Joel> case. And it seems Convert::Translit takes care of accents.
Glad that your current problem is solved. Consider the following
situations, though:
1. In Spanish, "ll" and "ch" are sometimes treated as "one character"
(e.g. for collating purposes).
2. In German, there is a single lower-case character (eszett, the one
that looks like a beta)... but in capital letters, it's written
"SS". What searches should work here?
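To make the German case concrete, here is a rough sketch. It is in Python
rather than Perl only because its built-in string methods make the folding
rule easy to show; the variable names are made up for illustration.

```python
# Illustration only: how full Unicode case folding treats the German eszett.
word_lower = "stra\u00dfe"   # "strasse" spelled with the single character U+00DF
word_upper = "STRASSE"       # the same word in capitals

# Simple lowercasing leaves U+00DF alone, so the two spellings stay different.
naive_match = word_lower.lower() == word_upper.lower()

# Full case folding maps U+00DF to "ss", so both spellings fold to "strasse".
folded_match = word_lower.casefold() == word_upper.casefold()

print(naive_match, folded_match)   # prints: False True
```

Folding does nothing for the Spanish "ll" and "ch" case; that is a collation
question, and it needs a locale-aware sort rather than a different comparison.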
And your comment of "there is a match or there isn't" is itself vague.
You have to specify more carefully what makes a match and what
doesn't. You might know -- but we don't, so we're left to guess.
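One way to pin the specification down is to write the matching rules out as
explicit choices. The sketch below is only that, a sketch: search_key and
matches are hypothetical helpers, it is written in Python rather than Perl,
and unicodedata.normalize is standing in for the job Unicode::Normalize does.
It assumes "match" means "the folded needle is a substring of the folded
text", which may well not be what you want.

```python
import unicodedata

def search_key(text, ignore_case=True, ignore_accents=True):
    """Reduce text to the form that gets compared (hypothetical helper)."""
    if ignore_accents:
        # Decompose to NFD so each accent becomes a separate combining
        # mark, then drop the marks and keep the base characters.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    if ignore_case:
        # Full case folding, so that eszett and "SS" end up the same.
        text = text.casefold()
    # Recompose so equivalent spellings of whatever is left compare equal.
    return unicodedata.normalize("NFC", text)

def matches(needle, haystack, **options):
    """True if needle occurs in haystack under the chosen rules."""
    return search_key(needle, **options) in search_key(haystack, **options)
```

With the defaults, matches("cafe", "Un CAF\u00c9 por favor") is true; pass
ignore_accents=False and it is not. Every flag is a decision somebody has to
make, which is the point.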
I guess I'm just venting some frustration that you are asking for a
solution, but seem uninterested in learning about the basics that
would help you form your own solution.
t.