B. Smith-Mannschott wrote:
> On 7/19/07, Matthias Wächter <matthias.waechter@tttech.com> wrote:
>> Erik,
>>
>> I am neither a Unicode nor Subversion (developer) expert, but let me
>> make my (verbose) point anyway.
>>
>> On 18.07.2007 16:15, Erik Huelsmann wrote:
>> > Unicode has 2 different representations
>>
>
> [... snip ...]
>
>> 5. What about Unicode code groups that represent one NFC symbol but
>> multiple NFD symbols that _cannot_ be re-translated to NFC? For
>> example, U+3374 SQUARE BAR [2] is a single code to represent the
>> character sequence 'bar' in square format. The given decomposition
>> is U+0062 U+0061 U+0072 which is the ASCII sequence 'bar'.
>> Certainly, re-coding to NFC will result in no change. Do we want to
>> disallow those? BTW: Is this correct, does OS X translate U+3374 to
>> this three-letter sequence?
>
> This is misleading. It's true for NFKC and NFKD, the
> "compatibility" normalizations, which are lossy by design. NFD does
> not decompose SQUARE BAR.
>
> Do you know of an example where NFD->NFC->NFD is lossy?

I think this might be one:
http://www.unicode.org/review/pr-29.html
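
For what it's worth, the SQUARE BAR point quoted above is easy to check
with Python's standard unicodedata module. This is only a minimal sketch:
Python is my own choice here, not something used in this thread, and the
helper names forms() and nfd_roundtrip_ok() are made up for illustration.
The results depend on the Unicode data shipped with the interpreter, and
the sketch does not try to reproduce the sequences discussed in PR-29.

```python
import unicodedata

SQUARE_BAR = "\u3374"  # U+3374 SQUARE BAR

def forms(s):
    """Return all four normalization forms of s, keyed by form name."""
    return {f: unicodedata.normalize(f, s)
            for f in ("NFC", "NFD", "NFKC", "NFKD")}

def nfd_roundtrip_ok(s):
    """True if NFD -> NFC -> NFD gives back the NFD form of s."""
    nfd = unicodedata.normalize("NFD", s)
    nfc = unicodedata.normalize("NFC", nfd)
    return unicodedata.normalize("NFD", nfc) == nfd

result = forms(SQUARE_BAR)
for name, value in result.items():
    # Show code points so the difference is visible on any terminal.
    print(name, [f"U+{ord(ch):04X}" for ch in value])
```

The canonical forms (NFC, NFD) leave U+3374 alone; only the compatibility
forms (NFKC, NFKD) turn it into the three letters, and nothing turns
'bar' back into U+3374. nfd_roundtrip_ok() can be pointed at any
candidate path name to test the NFD->NFC->NFD question directly.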
Stefan
--
___
oo // \\ "De Chelonian Mobile"
(_,\/ \_/ \ TortoiseSVN
\ \_/_\_/> The coolest Interface to (Sub)Version Control
/_/ \_\ http://tortoisesvn.net
Received on Thu Jul 19 17:49:11 2007