Re: [RFC] Unicode character encoding for other-than-filenames?

From: David Glasser <glasser_at_davidglasser.net>
Date: Wed, 2 Apr 2008 19:39:57 -0700

On Wed, Apr 2, 2008 at 3:54 PM, Branko Čibej <brane_at_xbc.nu> wrote:
> Erik Huelsmann wrote:
>
> > Working on the NFC/NFD awareness of the Subversion client, I soon
> > realised this issue also affects URLs.
> >
> > However, many of the textual identifiers we use in the client (such as
> > changelists) are also UTF-8 encoded. How should Subversion behave if a
> > user sets a changelist with NFC encoded characters and later (somehow)
> > tries to retrieve that same changelist using NFD encoded characters?
> > Giving an error message with the changelist name will look strange to
> > the user: the changelist identifier looks exactly the same to the
> > user.
> >
> > So, do we have to do Unicode-aware string comparison for
> > other-than-filename-identifiers? If so, which ones?
> >
> >
>
> We only have to normalize keys in the repository, which means filenames.
> Everything else is either not indexed, or stored only locally in the WC. If
> it's local, and the user or OS magically changes the normalization during
> the lifetime of the WC ...

Or, well, to expand: we only have to normalize keys in the repository
(because they are shared by multiple clients), or things that we let
the OS reinterpret for us (like filenames).
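
To make the changelist scenario from Erik's mail concrete, here is a minimal Python sketch. It is not Subversion code: `normalize_key`, the `changelists` dictionary, and the example names are all hypothetical, and the choice of NFC as the canonical form is an assumption for illustration only. It shows that two strings that look identical can compare unequal byte for byte, and that normalizing to one form before using the name as a key makes the lookup succeed.

```python
import unicodedata


def normalize_key(name: str, form: str = "NFC") -> str:
    """Return name in one fixed normalization form, for use as a lookup key.

    Hypothetical helper; NFC is an assumed choice, not Subversion's policy.
    """
    return unicodedata.normalize(form, name)


# The same visible name in two encodings.
nfc_name = "caf\u00e9-fixes"    # precomposed e-acute, U+00E9
nfd_name = "cafe\u0301-fixes"   # "e" followed by combining acute, U+0301

# Hypothetical changelist store keyed by the normalized name.
changelists = {normalize_key(nfc_name): ["foo.c", "bar.c"]}

# A plain comparison treats the two spellings as different identifiers.
print(nfc_name == nfd_name)                     # False

# Normalizing the query first finds the changelist either way.
print(normalize_key(nfd_name) in changelists)   # True
```

Whether this is worth doing for a given identifier comes down to the point above: it matters where the key is shared between clients or handed to the OS, and much less where it stays local to one working copy.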

--dave

-- 
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
Received on 2008-04-03 04:40:07 CEST

This is an archived mail posted to the Subversion Dev mailing list.
