Re: svn checkout - special characters in file name are not encoding properly

From: Vincent Lefevre <vincent-svn_at_vinc17.net>
Date: Thu, 12 Aug 2010 10:14:58 +0200

On 2010-08-12 09:59:30 +0200, Csaba Raduly wrote:
> On Wed, Aug 11, 2010 at 4:49 PM, Michael Pruemm wrote:
> > Vincent Lefevre wrote:
> (snip)
> >> Under these conditions, the only possibility is
> >> to encode the filenames in UTF-8 anyway. So, why not enforcing
> >> that?
> >>
> >
> > But don't forget that different platforms may use different UTF-8 encodings
> > for the same filename.
>
> Huh? There's only one UTF-8 encoding for each Unicode code point. Are
> you thinking of code pages?

Michael means that there are several ways to represent the "same"
string (from a semantic point of view). Unicode defines two canonical
normalization forms: NFC and NFD. While Linux does not try to normalize
filenames (they are just viewed as a sequence of bytes[*]), Mac OS X
(at least with HFS+) requires that filenames be valid UTF-8
strings (even in non-UTF-8 locales) and normalizes them to NFD
before storing them on disk.

[*] The locale doesn't matter, and top-bit-set bytes are allowed and
can be handled even in ASCII-based locales.
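
To illustrate the NFC/NFD point, here is a minimal Python sketch. The
filename "café.txt" is only a hypothetical example, and the script just
shows how the two normalization forms of the same name encode to
different UTF-8 byte sequences; it does not touch any file system or
any Subversion API.

```python
import unicodedata

# Hypothetical filename, used only for illustration.
name = "caf\u00e9.txt"

# NFC keeps the precomposed character U+00E9.
name_nfc = unicodedata.normalize("NFC", name)
# NFD decomposes it into "e" followed by U+0301 (combining acute accent).
name_nfd = unicodedata.normalize("NFD", name)

bytes_nfc = name_nfc.encode("utf-8")
bytes_nfd = name_nfd.encode("utf-8")

# Same name semantically, but different code points and different bytes.
print(name_nfc == name_nfd)    # False
print(bytes_nfc)               # b'caf\xc3\xa9.txt'
print(bytes_nfd)               # b'cafe\xcc\x81.txt'

# Normalizing both sides to one form makes them compare equal again.
print(unicodedata.normalize("NFC", name_nfd) == name_nfc)    # True
```

A file system that compares names byte by byte would presumably see
these as two different filenames, while one that normalizes to NFD
would see a single name.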

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
Received on 2010-08-12 10:15:37 CEST

This is an archived mail posted to the Subversion Users mailing list.