Re: UTF-8 support on Mac?

From: Ryan Schmidt <subversion-2007b_at_ryandesign.com>
Date: Tue, 5 Feb 2008 19:17:42 -0600

On Feb 5, 2008, at 10:59, B. Blodau wrote:

> Hello out there,
> I'm encountering a problem when calling svn_client_commit3() on the
> Mac.
>
> The name of the file to be committed contains a non-ASCII
> character, in this case an 'ä' (a-umlaut). This is perfectly
> encoded to UTF-8 but not as a precomposed character (0xc3, 0xa4)
> but as a normalized character consisting of the base character 'a'
> plus a following combining character '¨' (COMBINING DIAERESIS).
> So the UTF-8 byte sequence is: 0x61, 0xcc, 0x88.
>
> When calling svn_client_commit3(), I'm getting the error message:
> "Can't convert string from 'UTF-8' to native encoding." from
> "subversion/libsvn_subr/utf.c".
>
> I'm a bit irritated because the documentation for
> svn_client_commit3() says:
> "targets is an array of const char* paths to commit. They need not
> be canonicalized nor condensed; this function will take care of that."
>
> Does svn support non-ASCII characters on the Mac?
> Does svn support non-ASCII characters in their normalized form on
> the Mac?
> Can I do anything to get this working? ;)
>
> Since other clients can handle such an umlaut, it might be that svn
> expects precomposed characters?

I'm not sure about the specific error message you encountered.
However, there is a known problem: Subversion does not seem to
canonicalize the UTF-8 representation of file names in any way. It
just stores the UTF-8 bytes as they come in from the client. On other
operating systems this does not seem to be a problem, because they
don't care whether the UTF-8 sequences are composed or decomposed.
But Mac OS X (or rather HFS+) does canonicalize file names to the
decomposed form, while other operating systems seem to create file
names in composed form, so Mac users working with Subversion
repositories that contain non-ASCII file names tend to run into this
problem:

http://subversion.tigris.org/issues/show_bug.cgi?id=2464
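
To make the composed/decomposed difference concrete, here is a minimal
sketch in Python. It is not anything Subversion itself does; it uses the
standard unicodedata module, and the helper names utf8_hex and same_name
are made up for illustration. It shows that the two spellings of 'ä' are
different byte sequences in UTF-8 and only compare equal once both have
been brought to the same normalization form.

```python
import unicodedata

# Two spellings of the same visible character 'ä'.
composed = "\u00e4"      # single precomposed code point (NFC form)
decomposed = "a\u0308"   # 'a' followed by COMBINING DIAERESIS (NFD form)


def utf8_hex(s):
    """Return the UTF-8 bytes of s as a space-separated hex string."""
    return " ".join(f"0x{b:02x}" for b in s.encode("utf-8"))


def same_name(a, b, form="NFC"):
    """Compare two names after normalizing both to the same form.

    Hypothetical helper: this is the kind of comparison a tool would
    need if it wanted to treat both spellings as one file name.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


print(utf8_hex(composed))                # 0xc3 0xa4
print(utf8_hex(decomposed))              # 0x61 0xcc 0x88
print(composed == decomposed)            # False: a byte-wise mismatch
print(same_name(composed, decomposed))   # True: equal after normalization
```

A byte-for-byte comparison, which is what storing the names unchanged
amounts to, treats these as two different file names, and that is the
mismatch described above.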

Received on 2008-02-06 02:18:18 CET

This is an archived mail posted to the Subversion Users mailing list.
