Re: Evil UTF-8 Character in filename in repo causing issues on my wc
From: Ryan Schmidt <subversion-2011a_at_ryandesign.com>
Date: Wed, 15 Jun 2011 01:39:30 -0500
On Jun 14, 2011, at 18:59, Stefan Sperling wrote:
> On Tue, Jun 14, 2011 at 04:24:46PM -0700, Geoff Hoffman wrote:
I would clarify this by saying the problem is that Subversion assumes that a filename submitted in one form of UTF-8 encoding (composed or decomposed) will always stay in that form, and on the HFS+ filesystem, used by Mac OS X, that assumption is not necessarily true. (HFS+ normalizes all UTF-8 filenames to decomposed form.)

Subversion would happily allow you to create two filenames that humans would consider identical (one with its UTF-8 characters composed, one with them decomposed). So clearly that's a bug in Subversion (or possibly apr or apr-util); it should normalize UTF-8 strings before running comparisons.

It also seems like a bug in Windows and Linux filesystems; I assume they also let you create multiple files whose names look identical and differ only in the composition of their UTF-8 characters. Mac OS X's is the only filesystem I know of that has fixed this bug, which therefore exposes the problem when collaborating between Mac OS X systems (which have the fix) and other systems (which do not).

Using only ASCII characters in your filenames is one way to combat the problem. This strategy works fine for me, but users who don't work primarily in English might find that harder. If you want to continue using UTF-8 characters in filenames, you can get a version of Subversion for Mac OS X that attempts to work around this problem by installing MacPorts and then running:
sudo port install subversion +unicode_path
The patch the +unicode_path variant applies is of course not officially supported.
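
To make the composed/decomposed distinction above concrete, here is a minimal Python sketch. It is only an illustration, not what Subversion or the +unicode_path patch actually does; the filename and the helper same_name are made up for this example. It shows two names that look identical but differ byte for byte, and that compare equal once both are normalized to the same form (NFC is used here; NFD would work equally well for comparison).

```python
import unicodedata

def same_name(a: str, b: str) -> bool:
    """Hypothetical helper: compare two filenames after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Two spellings of the same visible filename (example name, not from the thread).
composed = "caf\u00e9.txt"     # e-acute as a single code point (U+00E9)
decomposed = "cafe\u0301.txt"  # plain "e" followed by combining acute accent (U+0301)

# A naive comparison treats them as different names.
print(composed == decomposed)           # False

# Their UTF-8 byte sequences differ as well.
print(composed.encode("utf-8"))         # b'caf\xc3\xa9.txt'
print(decomposed.encode("utf-8"))       # b'cafe\xcc\x81.txt'

# Normalizing before comparing makes them match.
print(same_name(composed, decomposed))  # True
```

A client that stored the composed name and later read the decomposed one back from the filesystem would see a mismatch unless it normalized both sides like this before comparing.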
This is an archived mail posted to the Subversion Users mailing list.