
Re: Let's discuss about unicode compositions for filenames!

From: Hiroaki Nakamura <hnakamur_at_gmail.com>
Date: Fri, 3 Feb 2012 05:33:02 +0900

2012/2/3 Daniel Shahaf <danielsh_at_elego.de>:
> Branko Čibej wrote on Thu, Feb 02, 2012 at 21:03:47 +0100:
>> On 02.02.2012 20:22, Peter Samuelson wrote:
>> > [Hiroaki Nakamura]
>> >> In option (2), we do n12n on all clients on all platforms, and we
>> >> include web_dav_svn in "clients". So we convert all input paths to
>> >> the "server encoding", which is NFC.
>> > Indeed.  But the very concept of a "server encoding" means we are
>> > involving the server side.  Which invokes a lot of difficult questions
>> > like "what about existing 1.x clients", "what about existing checkouts"
>> > and "what about existing repositories".
>> >
>> > By proposing a client-only solution, I hope to avoid _all_ those
>> > questions.
>>
>> Can't see how that works, unless you either make the client-side
>> solution optional, create a mapping table, or make name lookup on the
>> server agnostic to character representation. I can't envision how any of
>> those solutions would work all the time.
>>
>> It would be nice if we could normalize paths in the repository without
>> having to perform a dump/reload cycle, but I don't know how that would
>> work in FSFS
>
> It won't.  Changing the encoding increases the length (in bytes) of the
> string (in the dirents hash, for example), and thus changes the offsets
> of the node-revs that are later in the file --- to which subsequent
> revisions, and the id's of those node-revs, refer.

Converting from NFD to NFC does not increase the length.
The result is the same length or shorter, never longer.

Here I quote from
http://svn.apache.org/repos/asf/subversion/trunk/notes/unicode-composition-for-filenames
> The proposed internal 'normal form' should be NFC, if only if
> it were because it's the most compact form of the two: when
> allocating memory to store a conversion result, it won't be
> necessary (ever) to allocate more than the size of the input buffer.
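
As a quick sanity check, here is a small sketch that compares UTF-8 byte
lengths of a few names in NFD and NFC. It assumes Python's standard
unicodedata module; the helper names and the sample file names are my own,
made up for illustration, and a handful of samples is of course not a proof.

```python
import unicodedata


def utf8_len(s: str) -> int:
    """Length of s in bytes when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def compare_forms(name: str) -> tuple[int, int]:
    """Return (NFD byte length, NFC byte length) for the given name."""
    nfd = unicodedata.normalize("NFD", name)
    # Start from the NFD form, as an NFD-producing client would send it.
    nfc = unicodedata.normalize("NFC", nfd)
    return utf8_len(nfd), utf8_len(nfc)


# Hypothetical sample names, written with escapes so the source file's own
# normalization does not matter.
samples = [
    "r\u00e9sum\u00e9.txt",  # Latin letters with acute accents
    "\u010cibej",            # C with caron
    "\u30ac.txt",            # Katakana GA (KA + voiced sound mark in NFD)
    "plain.txt",             # ASCII only, unchanged by normalization
]

for name in samples:
    nfd_len, nfc_len = compare_forms(name)
    print(f"{name!r}: NFD={nfd_len} bytes, NFC={nfc_len} bytes")
    # For these samples the NFC form is never longer than the NFD form.
    assert nfc_len <= nfd_len
```

In each of these samples the NFC form takes the same number of bytes as
the NFD form or fewer.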

-- 
)Hiroaki Nakamura) hnakamur_at_gmail.com
Received on 2012-02-02 21:33:36 CET
