Re: Umlaut problem on Mac (composed vs. decomposed UTF-8)

From: Erik Huelsmann <ehuels_at_gmail.com>
Date: 2007-07-16 17:26:02 CEST

On 7/16/07, Thomas Singer <subversion@smartcvs.com> wrote:
> Hi Erik,
>
> I'm no C(++) developer and hence could not be of direct help to the
> Subversion team in fixing the issue. I just get enough complaints about our
> SmartSVN not working "right" with umlauts/accents in file names on the Mac.
> But before it can be "right" in SVNKit, the core of SmartSVN, it must be
> right in Subversion itself (after all, it is the reference implementation).
> If we fixed it ourselves in SVNKit without a fix in Subversion, there
> might be a chance of incompatibility, which we want to avoid entirely.
>
> We've been told that setting the "right" system locale allows
> Subversion to add/commit files with umlauts/accents in the file name on the
> Mac (in the decomposed form). First, I don't understand why a system locale
> (which should be responsible for dates/times/currency or the UI language)
> has any influence on Subversion's (or C(++)'s?) ability to convert the
> reported file names to UTF-8 (in Java it is done magically in the
> background; there is no way to get the file names in non-UTF-8).

Because the locale also sets LC_CTYPE, which determines how characters
are encoded. Without that information, Subversion doesn't know what
the source encoding of a file name is...
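
To illustrate the point, here is a minimal sketch (in Python rather than
Subversion's C, and not Subversion's actual code) of why the codeset implied
by LC_CTYPE matters when converting a file name to UTF-8. The helper names
locale_codeset and to_utf8 are hypothetical, chosen only for this example.

```python
import locale


def locale_codeset() -> str:
    """Return the codeset implied by the environment's LC_CTYPE.

    Hypothetical helper. Relies on locale.nl_langinfo, which is
    available on Unix-like systems only.
    """
    try:
        # Adopt LC_CTYPE from the environment (LANG, LC_ALL, LC_CTYPE).
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # Unsupported locale in the environment: keep the current one.
        pass
    return locale.nl_langinfo(locale.CODESET)


def to_utf8(raw_name: bytes, codeset: str) -> bytes:
    """Re-encode a raw file name from `codeset` to UTF-8.

    Hypothetical helper. Raises UnicodeDecodeError when the bytes are
    not valid in the assumed codeset.
    """
    return raw_name.decode(codeset).encode("utf-8")


# The byte 0xFC is "ü" in ISO-8859-1, but it is not valid ASCII, so the
# conversion only works if the right source encoding is known.
latin1_name = b"\xfc.txt"
converted = to_utf8(latin1_name, "iso-8859-1")
```

With a "C" or "POSIX" locale the codeset is plain ASCII, so calling
to_utf8(latin1_name, "ascii") raises UnicodeDecodeError instead of guessing.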

Now, the above is the general case for Unixen, but I believe the Mac
may be an exception here: if it's impossible on a Mac to save files
with non-UTF-8 filenames, then maybe we should adjust the sources to
assume UTF-8 on that platform.

> IMHO the
> current behavior of Subversion/C(++) can't be "right", the transformation
> should be done out-of-the-box (without a change in the system
> configuration),

Normally, this is impossible, for the reasons mentioned above.

> because it might be unclear to the user what the "right"
> system locale is, especially when working with file names using characters
> from different languages (e.g. umlauts, accents or even Arabic/Asian
> characters).
>
> Second, after adding/committing files in the decomposed form into the
> repository, there needs to be some possibility to fix (decomposed) file
> names *in* the repository, so the users don't have to work with decomposed
> AND composed forms after Subversion supports storing files in exactly one
> form in the repository.

As you said, that's step 2: for now, we have no way to make it
work on both Mac and other OSes, so such a fix would be useless
on its own.
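
For readers unfamiliar with the composed/decomposed distinction in the
subject line, a small sketch using Python's standard unicodedata module
shows how the same visible name can have two different UTF-8 byte
sequences. The helper same_name is hypothetical, and standard NFC/NFD is
used here as an approximation; the Mac file system's decomposition rules may
differ in detail.

```python
import unicodedata

# "ü" as a single code point (composed form).
composed = "\u00fc"
# "u" followed by COMBINING DIAERESIS (decomposed form).
decomposed = "u\u0308"

# The two forms render the same but differ as strings and as UTF-8 bytes.
composed_bytes = composed.encode("utf-8")      # 2 bytes
decomposed_bytes = decomposed.encode("utf-8")  # 3 bytes


def same_name(a: str, b: str) -> bool:
    """Compare two file names after normalizing both to NFC.

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
```

A byte-wise comparison treats the two names as different files, which is
why storing exactly one form in the repository matters.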

bye,

Erik.

Received on Mon Jul 16 17:25:24 2007

This is an archived mail posted to the Subversion Dev mailing list.