Posted on 2009-06-19 13:48:30-07 by pelagic
Locales and Unicode
Hello, I'm trying the following:
my $dt0 = DateTime->new(
    year      => 2009,
    month     => 3,
    day       => 1,
    time_zone => 'Europe/Zurich',
    locale    => 'de_CH-UTF8',
);
print "'", $dt0->month_name, "'\n";
then from the shell:
$ perl test.pl | uniname
character  byte  UTF-32  encoded as  glyph  name
        0     0  000027  27          '      APOSTROPHE
        1     1  00004D  4D          M      LATIN CAPITAL LETTER M
Invalid UTF-8 code encountered at line 1, character 2, byte -1.
The sequence is not a valid UTF-8 character because the first byte, value 0xE4, bit pattern 11100100, requires 2 continuation bytes, but of the immediately following bytes, byte 1, value 0x72, bit pattern 11100100 is not a valid continuation byte, since its high bits are not 10.
...
This is because it returns the month name in ISO-8859-1. That's often a good idea for Swiss German text like 'März', but I explicitly asked for UTF-8 here. What did I not understand here?
Thanks for any hints!
pelagic
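One possible explanation, sketched here without DateTime so it runs on its own: if `month_name` returns a Perl character string rather than encoded bytes (an assumption, not something the post confirms), then a plain `print` to a filehandle with no encoding layer writes U+00E4 as the single byte 0xE4, which is exactly what uniname complains about. The variable `$month_name` below is a hypothetical stand-in for the value returned by `$dt0->month_name`.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical stand-in for $dt0->month_name: a character string
# containing U+00E4 (assumption: DateTime returns characters, not bytes).
my $month_name = "M\x{E4}rz";

# Bytes an unlayered print would effectively write for this string.
my $latin1 = encode('ISO-8859-1', $month_name);

# Bytes a UTF-8 consumer such as uniname expects.
my $utf8 = encode('UTF-8', $month_name);

printf "chars: %d, latin1 bytes: %d, utf8 bytes: %d\n",
    length($month_name), length($latin1), length($utf8);
# prints "chars: 4, latin1 bytes: 4, utf8 bytes: 5"

# Add an encoding layer so everything printed from here on is UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');
print "'", $month_name, "'\n";
```

If that assumption holds, adding the `binmode` line to test.pl before the `print` should make the pipe to uniname succeed, independent of the locale name passed to `DateTime->new`.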