Re: Let's discuss about unicode compositions for filenames!

From: Hiroaki Nakamura <hnakamur_at_gmail.com>
Date: Fri, 3 Feb 2012 05:33:02 +0900

2012/2/3 Daniel Shahaf <danielsh_at_elego.de>:
> Branko Čibej wrote on Thu, Feb 02, 2012 at 21:03:47 +0100:
>> On 02.02.2012 20:22, Peter Samuelson wrote:
>> > [Hiroaki Nakamura]
>> >> In option (2), we do n12n on all clients on all platforms, and we
>> >> include mod_dav_svn in "clients". So we convert all input paths to
>> >> the "server encoding", which is NFC.
>> > Indeed. But the very concept of a "server encoding" means we are
>> > involving the server side. Which invokes a lot of difficult questions
>> > like "what about existing 1.x clients", "what about existing checkouts"
>> > and "what about existing repositories".
>> >
>> > By proposing a client-only solution, I hope to avoid _all_ those
>> > questions.
>>
>> Can't see how that works, unless you either make the client-side
>> solution optional, create a mapping table, or make name lookup on the
>> server agnostic to character representation. I can't envision how any of
>> those solutions would work all the time.
>>
>> It would be nice if we could normalize paths in the repository without
>> having to perform a dump/reload cycle, but I don't know how that would
>> work in FSFS
>
> It won't. Changing the encoding increases the length (in bytes) of the
> string (in the dirents hash, for example), and thus changes the offsets
> of the node-revs that are later in the file --- to which subsequent
> revisions, and the id's of those node-revs, refer.

Converting from NFD to NFC does not increase the length.
The result is the same length or shorter, never longer.

Here I quote from
http://svn.apache.org/repos/asf/subversion/trunk/notes/unicode-composition-for-filenames
> The proposed internal 'normal form' should be NFC, if only
> because it's the more compact form of the two: when
> allocating memory to store a conversion result, it won't be
> necessary (ever) to allocate more than the size of the input buffer.
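
As a rough check of the length claim, here is a small Python sketch using
the standard unicodedata module. The helper names (utf8_len,
nfd_to_nfc_lengths) and the sample filenames are mine, chosen only for
illustration. It assumes paths are stored as UTF-8, and it only tests a
few samples; it is not a proof of the general case.

```python
import unicodedata


def utf8_len(s):
    """Return the number of bytes s occupies when encoded as UTF-8."""
    return len(s.encode("utf-8"))


def nfd_to_nfc_lengths(name):
    """Return (NFD byte length, NFC byte length) for a filename.

    The name is first forced into NFD so that the comparison always
    starts from a decomposed string, as in the NFD -> NFC case above.
    """
    nfd = unicodedata.normalize("NFD", name)
    nfc = unicodedata.normalize("NFC", nfd)
    return utf8_len(nfd), utf8_len(nfc)


if __name__ == "__main__":
    # Hypothetical sample filenames, written with explicit escapes so the
    # source file itself does not depend on any normalization form.
    samples = [
        "re\u0301sume\u0301.txt",  # 'e' + COMBINING ACUTE ACCENT, twice
        "\u304b\u3099.txt",        # HIRAGANA KA + COMBINING VOICED SOUND MARK
        "plain.txt",               # ASCII only, unaffected by normalization
    ]
    for name in samples:
        nfd_bytes, nfc_bytes = nfd_to_nfc_lengths(name)
        print(ascii(name), "NFD:", nfd_bytes, "NFC:", nfc_bytes)
        # For these samples the NFC form is never longer than the NFD form.
        assert nfc_bytes <= nfd_bytes
```

For example, 'e' followed by U+0301 takes 3 bytes in UTF-8, while the
composed U+00E9 takes 2. Note that this only covers input that is already
in NFD; it says nothing about arbitrary, unnormalized input.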

-- 
)Hiroaki Nakamura) hnakamur_at_gmail.com

Received on 2012-02-02 21:33:36 CET
