
Re: [RFC] Unicode character encoding for other-than-filenames?

From: Branko Čibej <brane_at_xbc.nu>
Date: Thu, 03 Apr 2008 09:28:45 +0200

David Glasser wrote:
> On Wed, Apr 2, 2008 at 3:54 PM, Branko Čibej <brane_at_xbc.nu> wrote:
>> Erik Huelsmann wrote:
>>> Working on the NFC/NFD awareness of the Subversion client, I soon
>>> realised this issue also affects URLs.
>>> However, many of the textual identifiers we use in the client (such as
>>> changelists) are also UTF-8 encoded. How should Subversion behave if a
>>> user sets a changelist with NFC encoded characters and later (somehow)
>>> tries to retrieve that same changelist using NFD encoded characters?
>>> Giving an error message with the changelist name will look strange to
>>> the user: the changelist identifier looks exactly the same to the
>>> user.
>>> So, do we have to do Unicode-aware string comparison for
>>> other-than-filename-identifiers? If so, which ones?
>> We only have to normalize keys in the repository, which means filenames.
>> Everything else is either not indexed, or stored only locally in the WC. If
>> it's local, and the user or OS magically changes the normalization during
>> the lifetime of the WC ...
> Or, well, to expand: we only have to normalize keys in the repository
> (because they are shared by multiple clients), or things that we let
> the OS reinterpret for us (like filenames).

Ah, indeed. I just conflated the two since we happen to use filenames in
both contexts.

-- Brane
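
For readers unfamiliar with the NFC/NFD issue discussed in this thread, here is a minimal sketch in Python of why two identifiers that look identical can fail a plain string comparison, and how normalizing both sides before comparing avoids that. The helper name same_identifier and the sample strings are hypothetical, chosen for illustration only; this is not Subversion's API or its actual implementation.

```python
import unicodedata


def same_identifier(a: str, b: str) -> bool:
    """Compare two identifiers after normalizing both to NFC.

    Hypothetical helper for illustration; not part of Subversion.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# The same visible name in two Unicode normalization forms.
nfc_name = "caf\u00e9"    # NFC: 'é' as one precomposed code point (U+00E9)
nfd_name = "cafe\u0301"   # NFD: 'e' followed by combining acute accent (U+0301)

# A plain comparison sees different code point sequences (and different
# UTF-8 bytes), even though both strings render the same way.
plain_equal = nfc_name == nfd_name

# Normalizing both sides to one form before comparing treats them as equal.
normalized_equal = same_identifier(nfc_name, nfd_name)

print(plain_equal, normalized_equal)
```

As the thread concludes, this kind of normalization only matters for keys that are shared between clients through the repository, or for names that the OS may reinterpret, such as filenames.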

Received on 2008-04-03 09:29:46 CEST

This is an archived mail posted to the Subversion Dev mailing list.
