
Re: Problems with accents in filenames

From: Vincent Lefevre <vincent+svn_at_vinc17.org>
Date: 2003-11-24 01:40:52 CET

On 2003-11-23 23:38:53 +0100, Branko Čibej wrote:
> There is no convention on Unix about how the filesystem encodes file
> names. Typically, the bytes that an application sends to the VFS layer
> are what gets written to disk. And the exact encoding of the characters
> in a file name depends on the current locale.

No, it depends on what the software gives to open(2). If the software
chooses to encode the filename in UTF-8 (even if the current locales
are not UTF-8 ones), you'll get UTF-8 encoding on the file system.
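
A minimal Python sketch of this point, assuming a Linux file system that
stores file name bytes verbatim (other systems may normalize or reject
some names). The helper create_with_raw_name and the file name are made up
for illustration. The name is passed as bytes, so the locale plays no part
in what reaches the kernel:

```python
import os
import tempfile

def create_with_raw_name(directory, raw_name):
    """Create an empty file named exactly raw_name (bytes) and return
    the directory listing as raw bytes, with no locale decoding."""
    path = os.path.join(directory, raw_name)
    fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o644)
    os.close(fd)
    return os.listdir(directory)  # bytes in, bytes out

# The application, not the locale, picks the encoding: UTF-8 here.
utf8_name = "caf\u00e9.txt".encode("utf-8")

with tempfile.TemporaryDirectory() as tmp:
    names = create_with_raw_name(os.fsencode(tmp), utf8_name)
    print(names)
```

Running this under a non-UTF-8 locale (for example with LC_ALL=C) should
give the same listing, since the name never goes through a locale
conversion.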

> Yes, different users can use different locales, and the same user can
> use different locales at different times. As you noticed, this will
> typically cause problems if the locales used are incompatible (such as
> in your example, where UTF-8 and ISO-8859-1 are incompatible for code
> positions above \x7f).

No, this won't cause any problem if the UTF-8 encoding is always chosen.
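
To make the incompatibility concrete, here is a small Python sketch (the
file name is only an example). It shows that the two encodings agree on
ASCII, disagree above \x7f, and that the trouble comes from mixing them,
not from using UTF-8 consistently:

```python
name = "\u00e9t\u00e9.txt"  # "été.txt"

utf8_bytes = name.encode("utf-8")         # b'\xc3\xa9t\xc3\xa9.txt'
latin1_bytes = name.encode("iso-8859-1")  # b'\xe9t\xe9.txt'

# ASCII-only names are byte-identical in both encodings.
ascii_same = "README".encode("utf-8") == "README".encode("iso-8859-1")

# UTF-8 bytes read back as ISO-8859-1 decode without error, but wrongly.
misread = utf8_bytes.decode("iso-8859-1")  # 'Ã©tÃ©.txt'

# ISO-8859-1 bytes read back as UTF-8 are not even valid.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

# Writing and reading with the same encoding always round-trips.
round_trip = utf8_bytes.decode("utf-8") == name
```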

> The bytes stored in the directory structure on disk are always
> interpreted in terms of the current locale settings.

This contradicts your first sentence, which says that there is
no convention about how the filesystem encodes file names.

> This is what you're seeing, and it's not a Subversion bug, it's a fact of
> life on Unix.

It is an inconsistency, therefore a Subversion bug (even if Subversion
chooses a local encoding -- for instance, in this case, the encoding
could be written somewhere in the .svn directory, and there would be
no bug).
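
A rough Python sketch of what recording the encoding could look like. This
is purely hypothetical: the file name "filename-encoding" and all the
function names are invented here and do not correspond to anything
Subversion actually stores in .svn. The idea is only that the working copy
remembers which encoding its on-disk names use, so later conversions do
not depend on whatever locale happens to be active:

```python
import os

ENCODING_FILE = "filename-encoding"  # hypothetical, not a real Subversion file

def record_encoding(admin_dir, encoding):
    """Store the encoding chosen for on-disk file names in admin_dir."""
    with open(os.path.join(admin_dir, ENCODING_FILE), "w", encoding="ascii") as f:
        f.write(encoding + "\n")

def read_encoding(admin_dir):
    """Return the encoding previously recorded for this working copy."""
    with open(os.path.join(admin_dir, ENCODING_FILE), encoding="ascii") as f:
        return f.read().strip()

def disk_name_to_text(admin_dir, raw_name):
    """Decode an on-disk name (bytes) using the recorded encoding."""
    return raw_name.decode(read_encoding(admin_dir))

def text_to_disk_name(admin_dir, name):
    """Encode a name (str) into the bytes to use on disk."""
    return name.encode(read_encoding(admin_dir))
```

With "iso-8859-1" recorded, the on-disk bytes b"\xe9.txt" would always be
read as "é.txt", whether the user's current locale is UTF-8 or not.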

> I agree it would be nice if Unix file systems stored character data in a
> consistent encoding (such as, for example, NTFS on Windows, which uses
> UTF-16), but things simply don't work that way. Please stop trying to
> convince us that they do.

ROX-Filer (and GNOME applications) chose to encode the filenames in
UTF-8. If you are not convinced, try...

Vincent Lefèvre <vincent_at_vinc17.org> - Web: <http://www.vinc17.org/> - 100%
validated (X)HTML - Acorn Risc PC, Yellow Pig 17, Championnat International
des Jeux Mathématiques et Logiques, TETRHEX, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA
Received on Mon Nov 24 01:41:34 2003

This is an archived mail posted to the Subversion Dev mailing list.
