
Re: Umlaut problem on Mac (composed vs. decomposed UTF-8)

From: David Glasser <glasser_at_mit.edu>
Date: 2007-07-17 16:43:57 CEST

On 7/17/07, Marc Haisenko <haisenko@comdasys.com> wrote:
> Yes, you got it right, but I'll start over (but beware: I'm by no means an
> expert; I hope I get it right).
>
> Suppose you want to encode the character "ü" (umlaut-u). You can do it in two
> ways in Unicode: either use the "composed" form, which is just one code point:
> umlaut-u (U+00FC). Or you can use the "decomposed" form, which is two code
> points: a plain Latin "u" followed by a combining character (U+0308) that
> means "add an umlaut to the preceding character". A normal strcmp will say
> that the two representations are different because they are two different
> byte sequences.
>
> Both forms have their advantages and disadvantages. Windows wants to store
> Unicode filenames in "composed" form, while Mac OS X wants to store the
> filenames in "decomposed" form. This leads to problems.
>
> There are various solutions to this problem, but they all more or less
> require some "real" Unicode handling: either you have a strcmp that doesn't
> complain when one string is composed and the other decomposed, or you
> normalize each and every filename to an agreed-on representation. As far as I
> can see the latter is probably less error-prone (only a few well-known
> "entries" for filenames exist) and requires fewer code changes. Especially if
> the composed representation is used, as that already works on Windows and
> Linux, so now "only" the Mac OS X client needs to normalize them as well.
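
To make the two representations concrete, here is a minimal Python sketch of the byte-level difference and of the "normalize to an agreed-on form" option described above. The helper name same_name is made up for illustration, and choosing NFC (the composed form) as the agreed-on representation is an assumption; nothing here reflects actual Subversion code.

```python
import unicodedata

composed = "\u00fc"     # U+00FC LATIN SMALL LETTER U WITH DIAERESIS
decomposed = "u\u0308"  # "u" followed by U+0308 COMBINING DIAERESIS


def same_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The two forms encode to different UTF-8 byte sequences,
# which is why a plain byte-wise comparison treats them as different.
print(composed.encode("utf-8"))    # b'\xc3\xbc'
print(decomposed.encode("utf-8"))  # b'u\xcc\x88'
print(composed == decomposed)      # False

# After normalization both sides are the composed form and compare equal.
print(same_name(composed, decomposed))  # True
```

Normalizing at the few places where filenames enter the system keeps the comparison code itself unchanged, which is the trade-off argued for above.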

Ah, I see. So what are the actual effects? Is it something like:

* Linux user adds a file with a "ü" in its name, which is stored composed
* Mac user checks it out
* Mac user edits the file
* Mac user tries to commit; the commit request sends the name with a
decomposed "ü"
* Repository has no idea what file the Mac user's client is talking about

?
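
The scenario in the list above can be simulated with a rough Python sketch. The dictionary standing in for the repository, the file name, and both lookup functions are hypothetical, and the Mac OS X filesystem's behaviour is only approximated here by Unicode NFD.

```python
import unicodedata

# Hypothetical stand-in for the repository: paths keyed by the exact
# (composed) name that the Linux client committed.
repository = {unicodedata.normalize("NFC", "f\u00fcr.txt"): "file contents"}

# The same name as the Mac client would send it, assuming the filesystem
# hands it back decomposed (approximated here with NFD).
mac_name = unicodedata.normalize("NFD", "f\u00fcr.txt")


def lookup(repo: dict, name: str) -> bool:
    """Exact string match, as a byte-wise comparison would behave."""
    return name in repo


def lookup_normalized(repo: dict, name: str) -> bool:
    """Normalize the incoming name to NFC before the lookup."""
    return unicodedata.normalize("NFC", name) in repo


print(lookup(repository, mac_name))             # False: path not found
print(lookup_normalized(repository, mac_name))  # True
```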

--dave

-- 
David Glasser | glasser_at_mit.edu | http://www.davidglasser.net/
Received on Tue Jul 17 16:43:08 2007

This is an archived mail posted to the Subversion Dev mailing list.
