[tpm] ucfirst() and unicode
Digimer
linux at alteeve.com
Wed Apr 7 12:12:31 PDT 2010
On 10-04-06 03:33 PM, Stuart Watt wrote:
> Digimer wrote:
>> From reading perldoc perlunicode, I was able to figure out why
>> ucfirst() wasn't doing anything: the data I am altering is coming from
>> a UTF8-encoded database. I also see the example of creating UTF8
>> compatible ToUpper(), ToLower(), etc.
>>
>> There isn't an example of a compatible ucfirst() alternative, and as I
>> read it, I'd need to create a custom function listing the
>> source->destination unicodes to convert... This seems tedious so,
>> given that laziness is the source of all code, I am guessing someone
>> has come up with another way. Failing that, is there such a function
>> already?
>>
>> My CPAN search for 'ucfirst unicode' failed (though it's always
>> possible that there is a PEBCAK).
>>
>> tl;dr - need a ucfirst() variant that works with Unicode strings.
> I think some of this is locale-specific, which is why it isn't obvious.
> i.e., what actually happens can vary from locale to locale. For example,
> é can be uppercased to E or É depending on which region you are in. See
> http://search.cpan.org/~dapm/perl-5.10.1/pod/perllocale.pod#Category_LC_CTYPE:_Character_Types
> for some stuff.
>
> Just putting "use locale;" in your script might be a good place to start.
>
> All the best
> Stuart
>
This got me going in the right direction, thank you!
I had already been using 'use locale', but while looking into it I saw
that it plays with what perl interprets \l, \u, \L and \U to mean. From
that, I was able to create this little function that seems to work:
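sub uppercase_first_letter
{
	my ($word) = @_;
	my $new = "";
	# Lower-case every character first; \l lower-cases the character after it.
	foreach my $char (split //, $word)
	{
		$char =~ s/^(\w)/\l$1/;
		$new .= $char;
	}
	# Then upper-case only the leading character; \u upper-cases the character after it.
	$new =~ s/^(\w)/\u$1/;
	return $new;
}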
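
A possible alternative, offered only as a sketch: if the database hands back raw UTF-8 bytes and 'use locale' is not in effect, decoding the bytes into a Perl character string first should let the built-in lc() and ucfirst() apply Unicode case rules on their own. The helper name ucfirst_utf8 is made up for this example, and the assumption that the input is undecoded UTF-8 may not match every setup.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (the name is an assumption, not from the thread):
# takes UTF-8 encoded bytes, returns UTF-8 encoded bytes with the first
# character upper-cased and the rest lower-cased.
sub ucfirst_utf8
{
	my ($bytes) = @_;
	# Turn the raw UTF-8 bytes into a Perl character string.
	my $chars = decode('UTF-8', $bytes);
	# On a decoded string, lc() and ucfirst() follow Unicode case rules.
	$chars = ucfirst(lc($chars));
	# Re-encode so the result can be printed or stored as UTF-8 bytes.
	return encode('UTF-8', $chars);
}

# "\xc3\xa9COLE" is the UTF-8 byte sequence for "éCOLE".
print ucfirst_utf8("\xc3\xa9COLE"), "\n";	# prints "École" on a UTF-8 terminal
```

Like the function above, this lower-cases the rest of the word as well as capitalising the first letter. Decoding once where the data is read and encoding once where it is written would likely be tidier than doing both inside the helper.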
--
Digimer
E-Mail: linux at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin: http://nodeassassin.org