【Posted】: 2011-11-21 10:39:36
【Question】:
Why don't the following three characters have symmetric toLower, toUpper results?
/**
* Written in the Scala programming language, typed into the Scala REPL.
* Results commented accordingly.
*/
/* Unicode Character 'LATIN CAPITAL LETTER SHARP S' (U+1E9E) */
'\u1e9e'.toHexString == "1e9e" // true: "1e9e" == "1e9e"
'\u1e9e'.toLower.toHexString == "df" // true: "df" == "df"
'\u1e9e'.toHexString == '\u1e9e'.toLower.toUpper.toHexString // false: "1e9e" != "df"
/* Unicode Character 'KELVIN SIGN' (U+212A) */
'\u212a'.toHexString == "212a" // true: "212a" == "212a"
'\u212a'.toLower.toHexString == "6b" // true: "6b" == "6b"
'\u212a'.toHexString == '\u212a'.toLower.toUpper.toHexString // false: "212a" != "4b"
/* Unicode Character 'LATIN CAPITAL LETTER I WITH DOT ABOVE' (U+0130) */
'\u0130'.toHexString == "130" // true: "130" == "130"
'\u0130'.toLower.toHexString == "69" // true: "69" == "69"
'\u0130'.toHexString == '\u0130'.toLower.toUpper.toHexString // false: "130" != "49"
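【Comments】:
-
Maybe it is because Unicode is ambiguous? Some glyphs have more than one representation in Unicode,
and applying toLower after toUpper, or vice versa, normalizes to the "lowest" code point. -
Jeff Moser's excellent Turkey Test post covers the Turkish I problem in particular.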
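The idea raised in the comments, that several characters collapse onto one case form, can be checked directly. The following is a minimal sketch, assuming a JVM whose Unicode tables include U+1E9E (as the results in the question imply). `CaseRoundTrip`, `hex`, and `roundTrips` are illustrative names, not part of any library. Scala's `Char.toLower` and `Char.toUpper` delegate to `java.lang.Character`, which maps one character to one character, so two uppercase characters that share a lowercase form cannot both survive the round trip.

```scala
import java.util.Locale

// Hypothetical helper object, written only to illustrate the question.
object CaseRoundTrip {
  // Hex form of a character's code unit, e.g. 'K' gives "4b".
  def hex(c: Char): String = c.toInt.toHexString

  // True when lowercasing and then uppercasing returns the original Char.
  def roundTrips(c: Char): Boolean = c.toLower.toUpper == c

  def main(args: Array[String]): Unit = {
    // The three characters from the question: original, lowered, lowered then raised.
    for (c <- Seq('\u1e9e', '\u212a', '\u0130'))
      println(s"${hex(c)} -> ${hex(c.toLower)} -> ${hex(c.toLower.toUpper)}")

    // KELVIN SIGN and LATIN CAPITAL LETTER K share the lowercase form 'k',
    // so toUpper can return only one of them.
    println('K'.toLower == '\u212a'.toLower) // true

    // String-level conversion applies full, locale-sensitive mappings instead.
    val turkish = Locale.forLanguageTag("tr")
    println("i".toUpperCase(turkish) == "\u0130") // true
    println("\u00df".toUpperCase(Locale.ROOT))    // SS
  }
}
```

If a reversible comparison is the goal, comparing both sides after the same conversion (for example `a.toLower == b.toLower`) is usually safer than expecting `toLower.toUpper` to be the identity.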
Tags: unicode uppercase lowercase symmetry case-conversion