On 7/17/07, David Glasser <firstname.lastname@example.org> wrote:
> On 7/17/07, Marc Haisenko <email@example.com> wrote:
> > Yes, you got it right, but I'll start over (but beware: I'm by no means an
> > expert; I hope I get it right).
> > Suppose you want to encode the character "ü" (umlaut-u). You can do it in two
> > ways in Unicode: either use the "composed" form, which is just one character:
> > umlaut-u. Or you can use the "decomposed" form, which is two characters: a
> > plain Latin "u" followed by a combining character that means "add an umlaut
> > to the preceding character". A normal strcmp will say that the two
> > representations are different because they are two different byte sequences.
> > Both forms have their advantages and disadvantages. Windows wants to store
> > Unicode filenames in "composed" form, while Mac OS X wants to store the
> > filenames in "decomposed" form. This leads to problems.
> > There are various solutions to this problem, but they all more or less require
> > some "real" Unicode handling: either you have a strcmp that doesn't
> > complain when one string is composed and the other decomposed, or you
> > normalize each and every filename to an agreed-on representation. As far as I
> > can see, the latter is probably less error-prone (only a few well-known "entry
> > points" for filenames exist) and requires fewer code changes, especially if the
> > composed representation is used, as that already works on Windows and Linux,
> > so now "only" the Mac OS X client needs to normalize them as well.
> Ah, I see. So what are the actual effects? Is it something like:
> * Linux user adds a file with a :u in it, which is stored composed
> * Mac user checks it out
> * Mac user edits the file
> * Mac user tries to commit; the commit request sends the name with a
> decomposed :u
> * Repository has no idea what file the Mac user's client is talking about
Worse: the Mac client reports the versioned file as missing
immediately after checkout *and* reports a file (which looks *exactly*
the same to the user) as unversioned.
There's no committing to a file like that...
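
To make the failure concrete, here is a minimal sketch in Python (not Subversion code). The helper names naive_status and normalized_status and the file name are made up for illustration, and the sketch assumes the repository stores the composed name while the Mac file system hands back the decomposed one. It compares the two names byte-for-byte, as a plain strcmp would, and then again after normalizing both sides to the composed form (NFC):

```python
import unicodedata

# Two spellings of the same visible character "ü".
composed = "\u00fc"      # one code point: LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "u\u0308"   # "u" followed by COMBINING DIAERESIS


def naive_status(versioned, on_disk):
    """Compare names byte-for-byte, the way a plain strcmp would.

    Returns (missing, unversioned) as sorted lists of names.
    """
    v = {name.encode("utf-8"): name for name in versioned}
    d = {name.encode("utf-8"): name for name in on_disk}
    missing = sorted(v[k] for k in v.keys() - d.keys())
    unversioned = sorted(d[k] for k in d.keys() - v.keys())
    return missing, unversioned


def normalized_status(versioned, on_disk, form="NFC"):
    """Same comparison, but with every name normalized to one form first."""
    norm_versioned = [unicodedata.normalize(form, n) for n in versioned]
    norm_on_disk = [unicodedata.normalize(form, n) for n in on_disk]
    return naive_status(norm_versioned, norm_on_disk)


# Hypothetical file name: stored composed in the repository,
# returned decomposed by the file system after checkout.
repo_name = "f" + composed + "r.txt"
disk_name = "f" + decomposed + "r.txt"

missing, unversioned = naive_status([repo_name], [disk_name])
print("naive:      missing =", missing, " unversioned =", unversioned)

missing, unversioned = normalized_status([repo_name], [disk_name])
print("normalized: missing =", missing, " unversioned =", unversioned)
```

The naive comparison reports one missing file and one unversioned file whose names look identical on screen, which is the situation described above. With both sides normalized, the two lists come back empty.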
Received on Tue Jul 17 16:45:19 2007