
a Unicode issue and a Mac character encoding issue

From: Erik Huelsmann <ehuels_at_gmail.com>
Date: 2007-07-17 20:33:12 CEST

Two issues on the mailing list became so intermixed that most people
could not make sense of them. They got mixed up because both tend to
show up on Mac OS X. This mail is about separating them and
explaining the first of the two. The second issue is more
elaborate and needs explaining in a separate mail.

The two issues are:
1) Under Mac OS X, most applications work flawlessly when the locale
system hasn't been completely set up, but not so Subversion: svn won't
read data off the disk until it has.

2) Standardizing internally on UTF-8 alone isn't enough to free us
from character encoding issues. (Yet this is what we did.) This
problem shows up on the Mac (especially in a mixed-systems
environment) because the Mac standardized on a different Unicode
normalization form than Linux/Windows.

A separate mail is to come about item (2); the rest of this mail is
about item (1).

Users have been complaining that it is weird that svn requires a fully
set up locale system, since paths are by requirement encoded in UTF-8
on Darwin (their words, not mine).

The reason we require the locale system to be set up is that APR
tells us to: APR says the filesystem path encoding should be obtained
from the locale system. This, however, contradicts the stated
requirement of UTF-8 encoded filenames.
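
To make "gotten from the locale system" concrete, here is a minimal
sketch in Python. It is not APR's code; it assumes the lookup boils
down to the usual POSIX setlocale() plus nl_langinfo(CODESET) pair, and
the function name locale_path_encoding is made up for this example.

```python
import locale


def locale_path_encoding():
    """Return the codeset the current locale reports, or None if unusable."""
    try:
        # Adopt whatever LANG / LC_ALL / LC_CTYPE say. Without this call
        # the process stays in the default "C" locale.
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # The environment names a locale that is not installed.
        return None
    # POSIX-only call; reports the character encoding of the active locale.
    return locale.nl_langinfo(locale.CODESET)


if __name__ == "__main__":
    print("locale codeset:", locale_path_encoding())
```

Running it once with the locale variables unset and once with, say,
LC_ALL naming a UTF-8 locale should show what a program that trusts the
locale would assume about path names in each case.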

I tried to find a document describing this requirement, but alas
I've not been able to (yet). Test results would help a lot here too:
is it possible to set the locale to something that is not UTF-8
(latin1) and end up with non-UTF-8 file names in the filesystem? Or
will Mac OS X tell us to deliver UTF-8 filenames even though the
locale is latin1?
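
One way to collect such a data point is sketched below in Python. It
hands the filesystem a file name that is valid Latin-1 but not valid
UTF-8 and reports whether the name was stored or rejected. The function
name and the test file name are invented for this example, and it
assumes a Unix-like system. Because it passes raw bytes, it bypasses
the locale entirely, so it only answers the "does the filesystem itself
insist on UTF-8" half of the question.

```python
import os
import tempfile


def probe_latin1_filename():
    """Try to create a file whose name is Latin-1 bytes, not valid UTF-8."""
    # "cafe" with e-acute in Latin-1; a lone 0xE9 byte is not valid UTF-8.
    name = b"caf\xe9.txt"
    with tempfile.TemporaryDirectory() as tmp:
        tmp_bytes = os.fsencode(tmp)
        try:
            # A bytes path goes to the OS as-is, with no locale conversion.
            with open(os.path.join(tmp_bytes, name), "wb"):
                pass
        except OSError as err:
            return ("rejected", err.strerror)
        # With a bytes argument, listdir returns the names as stored on disk.
        return ("stored", os.listdir(tmp_bytes))


if __name__ == "__main__":
    print(probe_latin1_filename())
```

If someone with a Mac could run this and report what it prints, along
with the filesystem type, that would already be a useful result.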

Anyway, I contacted dev@apr and was told the current behaviour is
intentional and matches all other Unices. True as that may be, if all
path names are always UTF-8, then Mac OS X is not as Unix as most
others...

So, why do I want this issue to be fixed? Well, it's a BFI (Bad First
Impression) for a large number of Mac Subversion users...

I'm hoping someone can help me with the data points I need [I can't
collect them myself, not being a Mac owner].

bye,

Erik.

Received on Tue Jul 17 20:32:23 2007

This is an archived mail posted to the Subversion Dev mailing list.
