
MacOSX filename encoding issue

From: Martin Hauner <martin.hauner_at_gmx.net>
Date: 2006-04-22 19:40:44 CEST

Hi,

while fixing "svn: Can't convert string from native encoding to 'UTF-8':"
errors in subcommander when using filenames with extended characters on
MacOSX, I noticed some strange behaviour that is reproducible with the
svn command line tool (1.3.0).

The first thing I have to do is set LANG so that svn works at all.
Without it, svn fails with the above error.

setlocale(LC_ALL, "") doesn't seem to work on MacOSX if LANG isn't set.

For the first test I'm using DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ONE
(utf16: 278A, utf8: E2 9E 8A)

$ svn mkdir ➊
A ➊

$ svn st
A ➊

This is as expected. Now another character, the German umlaut ö:

ö (utf16: 00F6, utf8: C3 B6)

$ svn mkdir ö
A ö
$ svn st
? ö
! ö

This is unexpected. It looks like status gets a different filename
when it reads the directory, so it thinks that the new dir is missing
and that there is an unversioned item of the same name.
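
The symptom can be reproduced outside svn with a small sketch. It assumes, as a simplification, that status compares the names recorded in the entries file with the names returned by reading the directory, byte for byte. The function name naive_status is made up for illustration and is not svn's code.

```python
def naive_status(versioned, on_disk):
    """Compare names byte for byte, the way a plain strcmp would (hypothetical helper)."""
    versioned_bytes = {name.encode("utf-8") for name in versioned}
    disk_bytes = {name.encode("utf-8") for name in on_disk}
    lines = []
    # On disk but unknown to the working copy: reported as unversioned.
    for name in sorted(disk_bytes - versioned_bytes):
        lines.append("? " + name.decode("utf-8"))
    # Known to the working copy but not found on disk: reported as missing.
    for name in sorted(versioned_bytes - disk_bytes):
        lines.append("! " + name.decode("utf-8"))
    # Present on both sides: the scheduled add shows up normally.
    for name in sorted(versioned_bytes & disk_bytes):
        lines.append("A " + name.decode("utf-8"))
    return lines

composed = "\u00f6"      # ö as a single code point (C3 B6 in utf8)
decomposed = "o\u0308"   # o + COMBINING DIAERESIS (6F CC 88 in utf8)

for line in naive_status([composed], [decomposed]):
    print(line)
```

Both printed names render as ö, yet they differ at the byte level, which matches the "?" and "!" pair seen in the svn st output.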

The entries file in .svn looks good.

Looking at the output of ll -B (works only with LANG unset) shows that
svn is really getting something different:

drwxr-xr-x 3 hauner hauner 102 Apr 22 15:30 o\314\210
drwxr-xr-x 3 hauner hauner 102 Apr 22 18:11 \342\236\212

The second line is the digit one; converting the octal numbers to hex
gives its utf8 code (E2 9E 8A). What should be the ö is something
different: o followed by CC 88, which is the utf8 code of COMBINING
DIAERESIS (U+0308), a combining character with two dots.
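
The conversion from the octal escapes in the listing to hex bytes and character names can be checked with a short sketch. The helper name decode_octal_escapes is an assumption for illustration; it only handles the backslash plus three octal digits form shown in the listing.

```python
import re
import unicodedata

def decode_octal_escapes(text):
    """Turn backslash + three octal digit escapes into raw bytes (hypothetical helper)."""
    return re.sub(
        rb"\\([0-3][0-7]{2})",
        lambda m: bytes([int(m.group(1), 8)]),
        text.encode("ascii"),
    )

for escaped in (r"o\314\210", r"\342\236\212"):
    raw = decode_octal_escapes(escaped)
    chars = raw.decode("utf-8")
    # Show the hex bytes and the Unicode name of every decoded character.
    print(raw.hex(" "), [unicodedata.name(c) for c in chars])
```

The first name decodes to the bytes 6f cc 88 (two characters), the second to e2 9e 8a (one character).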

I'm no unicode expert, but I guess a 100% unicode compatible program
(for example a text editor) would combine the o with COMBINING DIAERESIS
to display it as a single ö character?
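
Unicode calls this combining step normalization: form NFC composes o + COMBINING DIAERESIS into the single ö, and form NFD splits it again. A minimal sketch using Python's standard unicodedata module follows; the helper name compose is made up here, and the assumption that the Mac filesystem hands back a decomposed form is taken from the listing above, not from svn's code.

```python
import unicodedata

def compose(name):
    """Return the NFC (composed) form of a name (hypothetical helper)."""
    return unicodedata.normalize("NFC", name)

decomposed = "o\u0308"           # what the directory listing shows
composed = compose(decomposed)   # a single code point, U+00F6

print(len(decomposed), len(composed))
print(composed.encode("utf-8").hex(" "))

# The digit one has no decomposition, so it is identical in both forms,
# which would explain why only the ö directory is affected.
digit_one = "\u278a"
print(unicodedata.normalize("NFD", digit_one) == digit_one)
```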

Now the question is (assuming my analysis is correct) whether it is
possible to work around this strange behaviour of the Mac filesystem?

It would be nice if there were a combining-aware utf8strcmp that could
be used by svn. I don't know how hard it would be to write such a
function.
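
One way such a comparison could look, as a sketch only: normalize both utf8 strings to the same form before comparing bytes. The name utf8_strcmp_normalized and the choice of NFC are assumptions for illustration, and svn itself is written in C, so this shows the idea rather than a drop-in function.

```python
import unicodedata

def utf8_strcmp_normalized(a: bytes, b: bytes) -> int:
    """strcmp-style compare of two utf8 byte strings after NFC normalization.

    Returns a negative number, zero, or a positive number (hypothetical helper).
    """
    na = unicodedata.normalize("NFC", a.decode("utf-8")).encode("utf-8")
    nb = unicodedata.normalize("NFC", b.decode("utf-8")).encode("utf-8")
    return (na > nb) - (na < nb)

composed = b"\xc3\xb6"     # ö as stored in the entries file
decomposed = b"o\xcc\x88"  # ö as returned when reading the directory

print(composed == decomposed)                        # plain byte compare
print(utf8_strcmp_normalized(composed, decomposed))  # normalized compare
```

The plain byte comparison treats the two names as different, while the normalized comparison returns 0 for them.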

-- 
Martin
Subcommander, http://subcommander.tigris.org
a cross platform Win32/Unix/MacOSX subversion GUI client & diff/merge tool.
Received on Sat Apr 22 19:41:26 2006

This is an archived mail posted to the Subversion Dev mailing list.
