Re: Evil UTF-8 Character in filename in repo causing issues on my wc

From: Vincent Lefevre <vincent-svn_at_vinc17.net>
Date: Wed, 22 Jun 2011 15:42:42 +0200

On 2011-06-15 12:29:37 +0200, Stefan Sperling wrote:
> Unicode, and its quirk of allowing the *same* character to be encoded
> in *different* ways, came much later.
> I think it is unfortunate that Apple broke with the concept that a
> filename is just a string of bytes.

It's unfortunate that Subversion breaks this concept too. :)

I mean: under Linux, check out a repository that contains filenames
with non-ASCII characters. Then change the locale's encoding (e.g.
ISO-8859-1 -> UTF-8). Do an update. And see the errors...
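
To illustrate why the locale matters, here is a minimal Python sketch. It
is not Subversion's actual code, and the filename is only an example I
made up. It shows that the same name becomes different bytes on disk
depending on the locale's encoding, and that bytes written under
ISO-8859-1 are not even valid UTF-8:

```python
# Hypothetical filename with one accented character (U+00E8).
name = "Lef\u00e8vre.txt"

# Bytes that would end up on disk under each locale encoding.
latin1_bytes = name.encode("iso-8859-1")
utf8_bytes = name.encode("utf-8")

# Try to read the ISO-8859-1 bytes back as if the locale were UTF-8.
try:
    latin1_bytes.decode("utf-8")
    readable_as_utf8 = True
except UnicodeDecodeError:
    readable_as_utf8 = False

print(latin1_bytes, utf8_bytes, readable_as_utf8)
```

So a tool that interprets on-disk names through the current locale will
no longer recognize the files it created under the previous one.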

> When they made this decision they probably considered that it might break
> applications and decided that the applications would have to adjust.

One problem is that different applications encode accented characters
(typed on the keyboard) differently: some of them use NFC, others use
NFD. If they aren't regarded as equivalent, you get problems. And
since Unicode doesn't standardize which one to use, one cannot blame
the applications.
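
A small Python sketch of the NFC/NFD difference, using the standard
unicodedata module (the filename and the same_name() helper are my own
illustration, not anything from Subversion). Both forms display
identically, but they are different code point sequences and different
UTF-8 bytes, so a byte-wise comparison treats them as two filenames:

```python
import unicodedata

# Hypothetical filename; U+00E8 is the precomposed form of "e" + grave accent.
name = "Lef\u00e8vre.txt"

nfc = unicodedata.normalize("NFC", name)  # composed: a single code point U+00E8
nfd = unicodedata.normalize("NFD", name)  # decomposed: "e" followed by U+0300

nfc_bytes = nfc.encode("utf-8")
nfd_bytes = nfd.encode("utf-8")


def same_name(a: str, b: str) -> bool:
    """Compare two names after bringing both to one normalization form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(nfc_bytes == nfd_bytes)  # byte-wise comparison
print(same_name(nfc, nfd))     # comparison after normalization
```

Normalizing both sides before comparing is one way to regard the two
forms as equivalent; whether a given tool should do that is a separate
question.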

Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
Received on 2011-06-22 15:43:15 CEST

This is an archived mail posted to the Subversion Users mailing list.