
Re: Problems with accents in filenames

From: Vincent Lefevre <vincent+svn_at_vinc17.org>
Date: 2003-11-22 12:35:33 CET

On 2003-11-21 14:23:34 -0700, Jani Averbach wrote:
> On 2003-11-21 21:58+0100, Vincent Lefevre wrote:
> > bash doesn't support non-ASCII charsets for filenames.
>
> $ export LC_CTYPE=fi_FI.ISO8859-1
> $ touch ääkkönen.txt
> $ ls -l ääkkönen.txt
> -rw-r--r-- 1 jaa jaa 0 2003-11-21 14:08 ääkkönen.txt
>
> $ export LC_CTYPE=fi_FI.UTF-8
> $ touch ääkkönen_2.txt
> $ ls -ltr
> -rw-r--r-- 1 jaa jaa 0 2003-11-21 14:08 ??kk?nen.txt
> -rw-r--r-- 1 jaa jaa 0 2003-11-21 14:15 ääkkönen_2.txt
>
> $ export LC_CTYPE=C
> $ ls -ltr
> -rw-r--r-- 1 jaa jaa 0 2003-11-21 14:08 ??kk?nen.txt
> -rw-r--r-- 1 jaa jaa 0 2003-11-21 14:15 ????kk??nen_2.txt

This is what I said: bash doesn't support non-ASCII charsets for
filenames, as you can see in this example; it just passes the raw
8-bit bytes through.
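
As a rough illustration of what the transcript above shows, here is a
small Python sketch (not something bash or ls actually runs; the helper
name as_listed is hypothetical, and replacing every undecodable byte
with '?' is an assumption modelled on the ls output quoted above). The
same name becomes different byte sequences under each locale, and a
reader in another locale cannot decode them:

```python
# The same filename, as typed by users in two different locales.
name = "ääkkönen.txt"

latin1_bytes = name.encode("iso-8859-1")  # 1 byte per accented letter
utf8_bytes = name.encode("utf-8")         # 2 bytes per accented letter


def as_listed(raw: bytes, charset: str) -> str:
    """Hypothetical helper: decode raw filename bytes as a listing in
    `charset` might, showing '?' for each undecodable byte."""
    return raw.decode(charset, errors="replace").replace("\ufffd", "?")


# ISO-8859-1 bytes read in a UTF-8 locale, then UTF-8 bytes read in C.
print(as_listed(latin1_bytes, "utf-8"))
print(as_listed(utf8_bytes, "ascii"))
```

The raw bytes are stored unchanged in both cases; only their
interpretation depends on the reader's locale.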

> > You can also just think that some user A in a UTF-8 locale creates
> > a file and some user B in an ISO-8859-1 locale wants to read the
> > file...
>
> Then you just lose?

With bash, yes. But with software that supports accented characters,
you don't. And such software does exist (e.g. ROX-Filer): whatever
locale each user chooses, all users will see the same characters.

> I don't understand your point, there is no universal agreement what
> is the default conversion for the charset of filenames, so how could
> subversion mandate that?

There seems to be a universal agreement amongst software that supports
non-ASCII charsets: always encode in UTF-8. If subversion doesn't do
that, it will break such software (bash is already broken concerning
non-ASCII charsets, as two users won't be able to share the same files
correctly if they use different locales).
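
To make the "always encode in UTF-8" convention concrete, here is a
minimal Python sketch, assuming names are converted from the user's
locale charset on the way in and back on the way out. The function
names to_stored and to_local are hypothetical; this is not how
subversion or ROX-Filer is implemented, only the idea:

```python
def to_stored(local_bytes: bytes, locale_charset: str) -> bytes:
    """Convert a name from the user's locale charset to UTF-8 for storage."""
    return local_bytes.decode(locale_charset).encode("utf-8")


def to_local(stored: bytes, locale_charset: str) -> bytes:
    """Convert a stored UTF-8 name to the reader's locale charset.
    Characters the locale cannot represent become '?'."""
    return stored.decode("utf-8").encode(locale_charset, errors="replace")


# User A works in a UTF-8 locale and creates the file.
typed_by_a = "ääkkönen.txt".encode("utf-8")
stored = to_stored(typed_by_a, "utf-8")

# User B works in an ISO-8859-1 locale and reads the same file.
seen_by_b = to_local(stored, "iso-8859-1")
print(seen_by_b.decode("iso-8859-1"))
```

Because the stored form never depends on anyone's locale, A and B get
the same characters; a name is only degraded when the reader's charset
has no representation for a character at all.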

-- 
Vincent Lefèvre <vincent_at_vinc17.org> - Web: <http://www.vinc17.org/> - 100%
validated (X)HTML - Acorn Risc PC, Yellow Pig 17, Championnat International
des Jeux Mathématiques et Logiques, TETRHEX, etc.
Work: CR INRIA - computer arithmetic / SPACES project at LORIA
Received on Sat Nov 22 12:36:20 2003

This is an archived mail posted to the Subversion Dev mailing list.
