Re: Let's discuss about unicode compositions for filenames!

From: Daniel Shahaf <danielsh_at_elego.de>
Date: Thu, 2 Feb 2012 22:25:16 +0200

Branko Čibej wrote on Thu, Feb 02, 2012 at 21:03:47 +0100:
> On 02.02.2012 20:22, Peter Samuelson wrote:
> > [Hiroaki Nakamura]
> >> In option (2), we do n12n on all clients on all platforms, and we
> >> include mod_dav_svn in "clients". So we convert all input paths to
> >> the "server encoding", which is NFC.
> > Indeed. But the very concept of a "server encoding" means we are
> > involving the server side. Which invokes a lot of difficult questions
> > like "what about existing 1.x clients", "what about existing checkouts"
> > and "what about existing repositories".
> >
> > By proposing a client-only solution, I hope to avoid _all_ those
> > questions.
>
> Can't see how that works, unless you either make the client-side
> solution optional, create a mapping table, or make name lookup on the
> server agnostic to character representation. I can't envision how any of
> those solutions would work all the time.
>
> It would be nice if we could normalize paths in the repository without
> having to perform a dump/reload cycle, but I don't know how that would
> work in FSFS

It won't. Changing the encoding changes the length (in bytes) of the
string (in the dirents hash, for example), and thus changes the offsets
of the node-revs that are later in the file, to which subsequent
revisions, and the IDs of those node-revs, refer.
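
To make the offset problem concrete, here is a minimal sketch in Python.
It is only a toy model and not the FSFS on-disk format: the helper
record_offsets and the record contents are made up for illustration.
The first record stands in for a directory entry name; the later
records stand in for node-revs that are addressed by byte offset.

```python
import unicodedata

# The same file name in the two normalization forms under discussion.
name = "Čibej.txt"
nfc = unicodedata.normalize("NFC", name)  # precomposed: U+010C
nfd = unicodedata.normalize("NFD", name)  # decomposed: "C" + U+030C

# The UTF-8 byte lengths differ even though the names look identical.
nfc_len = len(nfc.encode("utf-8"))
nfd_len = len(nfd.encode("utf-8"))


def record_offsets(records):
    """Return the starting byte offset of each record in a toy file
    made by concatenating the records (hypothetical layout)."""
    offsets = []
    pos = 0
    for rec in records:
        offsets.append(pos)
        pos += len(rec)
    return offsets


# A name record followed by two stand-in "node-rev" records.
tail = [b"node-rev-A\n", b"node-rev-B\n"]
offsets_nfd = record_offsets([nfd.encode("utf-8")] + tail)
offsets_nfc = record_offsets([nfc.encode("utf-8")] + tail)

print(nfc_len, nfd_len)
print(offsets_nfd)
print(offsets_nfc)
```

Rewriting the name in place from one form to the other moves every
record after it by one byte in this toy file, so anything that stored
the old offsets would now point into the wrong place.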

> (BDB would be fairly easy, modulo collisions, but I don't
> think those are very likely).
>
> -- Brane
>
Received on 2012-02-02 21:26:00 CET

This is an archived mail posted to the Subversion Dev mailing list.
