On Feb 13, 2008, at 09:14, Erik Huelsmann wrote:
>>>> This is broken. APR should switch to UTF-8 locales internally when it
>>>> deals with filenames (like what GNOME apps do). Otherwise this leads
>>>> to consistency problems when the user has both ISO-8859-1 and UTF-8
>>>> terminal sessions (the reason is that some applications and/or some
>>>> machines do not support multibyte character sets, and one wouldn't
>>>> want to mess everything when running svn in degraded mode, i.e. with
>>>> ISO-8859-1 locales).
>>>
>>> No. The way (non-Mac) unices deal with this is seriously broken. There
>>> is *no* guarantee the actual input paths are the encoding claimed by
>>> the locale settings.
>>>
>>> There is no way for APR to solve that issue. The only thing it can do
>>> is tell the application which input it should expect. Subversion
>>> offers conversion routines to do the actual "locale"->UTF8 path
>>> conversion since Subversion actually *is* UTF8 "inside", meaning that
>>> it's ok for Subversion to err when it encounters invalid (ie non-UTF8)
>>> input. Not all APR applications may find that desirable (for example:
>>> Apache httpd doesn't initialise locale settings, so, it can't do
>>> locale->utf8 conversions [as the C runtime doesn't know what the
>>> current locale is]; nor will it change that behaviour.)
>>
>> It's worse. SVN doesn't get it right either since it's ignorant of
>> Unicode normalization forms [1].
>
> Well, yes and no :-) Subversion depends (more so than, say, /bin/ls)
> on a sanely configured environment (locale on disk == locale in
> terminal, locale configured in the first place, etc). This is fine,
> since Subversion needs to operate across different configurations and
> even OSes (whereas /bin/ls does not).

Hold up for a second... I'm havin' a little trouble...

> locale on disk

What is this? I know that on my Mac, I use the HFS+ filesystem which
stores filenames in UTF-16. But that's a character encoding, and it's
not configurable; it's an integral part of the HFS+ specification.
Are you saying there's also an associated locale in the filesystem? I
don't think I've ever been asked to set one, and I don't know how I
would do so nor how I would figure out what it's set to now...

> == locale in terminal,

This is the locale I know about. "LANG=en_US.UTF-8" and so forth.
When I do this, Terminal knows how to display filenames from the disk
correctly because it converts the UTF-16 characters on disk into
UTF-8 characters for display in Terminal. Similarly, various commands
like ls and svn know that I want them to output UTF-8 characters to
the terminal.

> locale configured in the first place

What is this? What is "in the first place"? Is that when I first
checked out a working copy? When I first made a repository? When I
first installed Subversion? When I first installed the OS?

It sounded like Vincent was saying that if a working copy is created
under one terminal locale setting, but then accessed with a different
terminal locale setting, things don't work right.
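
To make that scenario concrete, here is a minimal Python sketch of what I
think is being described. It is not Subversion or APR code: the filename
and the helper to_internal_utf8() are made up for illustration, and it
assumes a filesystem that stores whatever bytes the program hands it (as
on the non-Mac unices Erik mentions), so the name created in an
ISO-8859-1 session stays on disk as ISO-8859-1 bytes.

```python
# Hypothetical filename containing a non-ASCII character (e with acute).
name = "r\u00e9sum\u00e9.txt"

# The same name as bytes under the two locale encodings discussed above.
latin1_bytes = name.encode("iso-8859-1")  # what an ISO-8859-1 session writes
utf8_bytes = name.encode("utf-8")         # what a UTF-8 session writes


def to_internal_utf8(raw: bytes, locale_encoding: str) -> bytes:
    """Illustrative stand-in for a "locale"->UTF-8 path conversion.

    Decodes the raw path bytes using the encoding the locale claims,
    then re-encodes them as UTF-8. Raises UnicodeDecodeError if the
    bytes are not valid in the claimed encoding.
    """
    return raw.decode(locale_encoding).encode("utf-8")


# Matching locale: the conversion round-trips to the expected UTF-8 bytes.
assert to_internal_utf8(latin1_bytes, "iso-8859-1") == utf8_bytes

# Name created under ISO-8859-1, later read under a UTF-8 locale:
# the bytes are not valid UTF-8, so the conversion fails outright.
try:
    to_internal_utf8(latin1_bytes, "utf-8")
    conversion_failed = False
except UnicodeDecodeError:
    conversion_failed = True

# Name created under UTF-8, later read under an ISO-8859-1 locale:
# nothing fails, but every non-ASCII character is silently misread.
misread = utf8_bytes.decode("iso-8859-1")

print(conversion_failed)
print(misread)
```

The second case is the nastier one: every byte is valid ISO-8859-1, so
the conversion cannot tell that it has been handed the wrong encoding.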
Received on 2008-02-13 21:26:28 CET