
Re: Let's discuss about unicode compositions for filenames!

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Mon, 30 Jan 2012 16:10:05 +0000 (GMT)

Let me just note some of the main similarities and differences between this issue of Unicode compositions and the issue of case-sensitivity in file names.
Differences:

  * NFC and NFD look the same when displayed, and most users haven't heard of them and don't expect that a computer might treat two identical-looking filenames as different.  With letter case, most users are aware that some systems treat upper and lower case letters as the same while other systems treat them as different, and they learn to behave according to the system's rules.

  * The main case-insensitive file systems are case-preserving with no "normal form", whereas the main system that treats NFC and NFD as equivalent (MacOS) chooses one form as the "normal form" and always normalizes the given file name to that form.
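
To make the first difference concrete, here is a small sketch in Python using the standard unicodedata module.  The file name and the helper same_after_normalizing are made up for illustration; nothing here is Subversion code.  It shows two names that would display identically but are different code point sequences until both are normalized to one form.

```python
import unicodedata

# Hypothetical file name containing "e with acute accent".
# NFC stores it as one precomposed code point (U+00E9) ...
nfc_name = unicodedata.normalize("NFC", "caf\u00e9.txt")
# ... while NFD stores it as "e" followed by a combining acute accent (U+0301).
nfd_name = unicodedata.normalize("NFD", "caf\u00e9.txt")


def same_after_normalizing(a, b, form="NFC"):
    """Compare two names after converting both to the same normal form.

    Illustrative helper only; the choice of NFC as the default is an assumption.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# The raw strings differ, even in length, although they render the same.
print(nfc_name == nfd_name, len(nfc_name), len(nfd_name))
# Once both are brought to one form, they compare equal.
print(same_after_normalizing(nfc_name, nfd_name))
```

A system that compares names byte for byte sees two files here; a system that normalizes first sees one.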

Similarities:
  * If two Unicode strings differ only by letter case, on some computer systems they refer to the same file, while on other systems they refer to different files.  The rules are created by the designers of the systems, sometimes explicitly and sometimes implicitly.  Different parts of a system can have different rules.  The same applies if two Unicode strings differ only by composition.

  * Subversion interoperates with different systems.  When two file names that differ only by letter case are transferred from a case-sensitive system to a case-insensitive system, they will collide and Subversion should handle this in some friendly way.  The same applies if two file names differ only by composition.

The differences are important, but the similarities are enough that we should be looking for some commonality in the implementation.
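
As a rough sketch of what such commonality might look like (not a proposal for the actual Subversion code, and written in Python only for brevity), collision detection could be one routine that takes the equivalence rule as a parameter.  The names case_key, composition_key and find_collisions are hypothetical, and str.casefold() and NFC are stand-ins for whatever rules a real target file system applies.

```python
import unicodedata
from collections import defaultdict


def case_key(name):
    # Assumption: casefold() approximates a case-insensitive file system;
    # real file systems use their own folding tables.
    return name.casefold()


def composition_key(name):
    # Assumption: NFC stands in for the target system's chosen normal form.
    return unicodedata.normalize("NFC", name)


def find_collisions(names, key):
    """Group names that the given key function treats as the same file.

    Returns only the groups with more than one member, in first-seen order.
    """
    groups = defaultdict(list)
    for name in names:
        groups[key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


# Same detection routine, two different equivalence rules.
print(find_collisions(["README", "readme", "Makefile"], case_key))
print(find_collisions(["caf\u00e9", "cafe\u0301", "tea"], composition_key))
```

The point of the sketch is only that the "which names are the same?" rule is pluggable while the detection and the friendly handling of a collision could be shared.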

- Julian
Received on 2012-01-30 17:10:40 CET

This is an archived mail posted to the Subversion Dev mailing list.