[tpm] ucfirst() and unicode
Stuart Watt
stuart at morungos.com
Tue Apr 6 12:33:51 PDT 2010
Digimer wrote:
> From reading perldoc perlunicode, I was able to figure out why
> ucfirst() wasn't doing anything; The data I am altering is coming from
> a UTF8-encoded database. I also see the example of creating UTF8
> compatible ToUpper(), ToLower(), etc.
>
> There isn't an example of a compatible ucfirst() alternative, and as
> I read it, I'd need to create a custom function listing the
> source->destination unicodes to convert... This seems tedious so,
> given that laziness is the source of all code, I am guessing someone
> has come up with another way. Failing that, is there such a function
> already?
>
> My CPAN search for 'ucfirst unicode' failed (though it's always
> possible that there is a PEBCAK).
>
> tl;dr - need a ucfirst() variant that works with Unicode strings.
I think some of this is locale-specific, which is why it isn't obvious:
what actually happens can vary from locale to locale. For example, é can
be uppercased to either E or É, depending on which region you are in. See
http://search.cpan.org/~dapm/perl-5.10.1/pod/perllocale.pod#Category_LC_CTYPE:_Character_Types
for some details.
Just putting "use locale;" in your script might be a good place to start.
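If the locale turns out not to be the cause, another possibility is worth checking. Below is a minimal sketch, assuming (as Digimer describes) that the strings come back from the database as raw UTF-8 bytes that have not yet been decoded. In that case ucfirst() sees the individual bytes rather than the characters, so decoding with the core Encode module first lets it apply Unicode case rules. The helper name ucfirst_utf8 is made up for this example, and whether your database driver already decodes for you is something to verify.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not from the thread): uppercase the first
# character of a string that is stored as raw UTF-8 bytes.
sub ucfirst_utf8 {
    my ($bytes) = @_;

    # Turn the UTF-8 bytes into a Perl character string, so that
    # ucfirst() works on characters instead of bytes.
    my $chars = decode('UTF-8', $bytes);

    # Uppercase the first character, then encode back to UTF-8 bytes
    # so the result has the same form as the input.
    return encode('UTF-8', ucfirst $chars);
}

my $word  = "\xC3\xA9cole";         # "école" as raw UTF-8 bytes
my $fixed = ucfirst_utf8($word);    # "\xC3\x89cole", i.e. "École"
```

This version uses Perl's built-in Unicode case mapping and does not depend on the system locale, so it will always give É rather than E for é.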
All the best
Stuart