
Re: Mac OS X: problems adding files with umlauts

From: Wilfredo Sánchez Vega <wsanchez_at_wsanchez.net>
Date: 2006-07-07 17:22:53 CEST

   My point was that LC_ALL is *not* relevant to decoding filenames,
because environment variables have nothing to do with how filenames
were encoded.

   And I was saying that on Mac OS X, one can reasonably expect files
to be encoded in UTF-8, because that is what Apple tells developers
to do, and most comply. On most Unix systems, there is no way to
know what encoding was used for a filename, and most developers
assume they can only (safely) use 7-bit ASCII for decoding; all other
characters are typically considered "unprintable". But on Mac OS X,
UTF-8 is the recommended encoding.
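
A minimal Python 3 sketch of the point above, not taken from Subversion itself: the bytes of a filename are fixed when the file is created, and the codec picked at read time (for instance from a locale setting) only changes how those same bytes are interpreted. The name `Müller.txt`, the constant `RAW_NAME`, and the helper `decode_name` are hypothetical and chosen purely for illustration.

```python
from typing import Optional

# Raw bytes of a filename as they might be stored on disk, assuming the
# creator used UTF-8 ("M\u00fcller.txt" is "Müller.txt").
RAW_NAME = "M\u00fcller.txt".encode("utf-8")


def decode_name(raw: bytes, encoding: str) -> Optional[str]:
    """Decode raw filename bytes with the given codec, or return None on failure."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    # The same bytes under three different assumptions about the encoding.
    for codec in ("utf-8", "latin-1", "ascii"):
        print(codec, repr(decode_name(RAW_NAME, codec)))
```

Only the `utf-8` guess recovers the original name. The `latin-1` guess succeeds but yields a different, garbled string, and the 7-bit `ascii` guess fails outright, which is why such bytes end up treated as "unprintable".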

   However, there exist byte sequences which are not valid UTF-8
strings, and yet it is possible to name a file with such a byte
sequence. In that case, an attempt to decode the filename assuming a
UTF-8 encoding will fail. I would not expect that to happen with any
file that a user gives a name to, but such a situation may happen if
software is generating filenames (e.g. using some internal
identifier), since using UTF-8 in filenames isn't an enforced
requirement on most filesystems.
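
As a hedged illustration of such a byte sequence, the Python 3 sketch below takes `café.txt` encoded as ISO-8859-1, which is not valid UTF-8, and checks whether it decodes. The helper names `decodes_as_utf8` and `try_create` are hypothetical. Whether the file can actually be created depends on the filesystem: many Unix filesystems accept arbitrary bytes, while others may refuse the name, so `try_create` reports the outcome rather than assuming one.

```python
import os
import tempfile


def decodes_as_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes form a valid UTF-8 string."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# In ISO-8859-1, "é" is the single byte 0xE9. In UTF-8, 0xE9 announces a
# three-byte sequence, but here it is followed by "." (0x2E), which is not
# a continuation byte, so the name is not valid UTF-8.
LATIN1_NAME = "caf\u00e9.txt".encode("latin-1")
UTF8_NAME = "caf\u00e9.txt".encode("utf-8")


def try_create(name: bytes) -> bool:
    """Try to create a file with this raw name in a temporary directory.

    Returns True if the filesystem accepted the name, False if it refused.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(os.fsencode(tmp), name)
        try:
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
        except (OSError, ValueError):
            # The platform or filesystem rejected the byte sequence.
            return False
        os.close(fd)
        return True


if __name__ == "__main__":
    print("UTF-8 name decodes:", decodes_as_utf8(UTF8_NAME))
    print("Latin-1 name decodes:", decodes_as_utf8(LATIN1_NAME))
    print("Filesystem accepted Latin-1 name:", try_create(LATIN1_NAME))
```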

        -wsv

On Jul 7, 2006, at 12:02 AM, Thomas Singer wrote:

>> That said, it is possible to write file names containing bytes
>> that can't decode as UTF-8.
>
> I can't believe that. Could you please give a reproducible example?
>
>> I think LC_ALL is relevant to what the encoding of svn's output
>> should be.
>
> I'm sure you mixed up two things here: the file names and the output.
> File names should always be convertible to a general character
> representation like UTF-8. Displaying the file names with the right
> glyphs in the output is a different issue and might depend on the
> font used.
>
> If you think LC_ALL should be relevant for the file name detection
> in Subversion, could you answer the following questions:
> - What LC_ALL value should the user set?
> - What should happen when the wrong value is set?
> - What value should be set for file names in different languages?

Received on Fri Jul 7 17:25:29 2006

This is an archived mail posted to the Subversion Users mailing list.