On 29 nov 2013, at 21:09, Branko Čibej <brane_at_wandisco.com> wrote:
> On 29.11.2013 20:42, Ivan Zhakov wrote:
>> On 29 November 2013 22:22, <brane_at_apache.org> wrote:
>>> Author: brane
>>> Date: Fri Nov 29 18:22:00 2013
>>> New Revision: 1546619
>>>
>>> URL: http://svn.apache.org/r1546619
>>> Log:
>>> * branches/fsfs-ucsnorm/BRANCH-README: New file.
>>>
>>> Added:
>>> subversion/branches/fsfs-ucsnorm/BRANCH-README (with props)
>>>
>>> Added: subversion/branches/fsfs-ucsnorm/BRANCH-README
>>> URL: http://svn.apache.org/viewvc/subversion/branches/fsfs-ucsnorm/BRANCH-README?rev=1546619&view=auto
>>> ==============================================================================
>>> --- subversion/branches/fsfs-ucsnorm/BRANCH-README (added)
>>> +++ subversion/branches/fsfs-ucsnorm/BRANCH-README [UTF-8] Fri Nov 29 18:22:00 2013
>>> @@ -0,0 +1,66 @@
>>> +The purpose of this [fsfs-ucsnorm] branch is to implement two optional
>>> +checks related to Unicode normalisation to FSFS.
>>> +
>>> +
>>> +Option: Prevent name collisions
>>> +===============================
>>> +
>>> +If this option is enabled, FSFS will reject operations that would
>>> +create two different representations of the same name in the same
>>> +directory. This will prevent situations where a user could see more
>>> +than one form of the name in a directory listing:
>> Nice feature, but why in the FS layer? Maybe it would be better to
>> implement this feature in the svn_repos layer?
>
> It's not, for at least two reasons:
>
>  * Users of the FS API must have the same constraints as repository
>    clients, otherwise the whole thing falls on its face.
>  * The repos layer cannot implement this optimally; at a rough guess,
>    it would have to double the number of lookups performed:
>     - The node cache is an FSFS implementation detail, and this option
>       will affect how cache keys are generated.
>     - Likewise for actual lookups into the on-disk representation.
I just want to say that, in my opinion, the design described in BRANCH-README since r1546640 looks very good.
You might remember from when I did some specification work (in the wiki) that I am a strong proponent of the "normalization-preserving" (n-p) approach to the problem. I believe n-p makes many issues with existing repositories much easier to manage; in most cases they go away completely unless there are actual normalization conflicts. E.g., the issue Bert raised on 2013-11-24 regarding mergeinfo should not be a problem with n-p (my guess, without having thought about it too much).
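As a rough illustration of the normalization-preserving idea combined with the collision check from BRANCH-README, here is a minimal Python sketch. It is not FSFS code: the `Directory` class, the `norm_key` helper, and the choice of NFC as the comparison form are all assumptions made for the example. Names are stored exactly as the client supplied them, but compared by their normalized form, so a second spelling of an existing name is rejected.

```python
import unicodedata


def norm_key(name: str) -> str:
    """Comparison key (hypothetical helper): the NFC form of the name.

    NFC is an assumption for this sketch; any single normal form works
    as long as it is used consistently.
    """
    return unicodedata.normalize("NFC", name)


class Directory:
    """Hypothetical directory: stores names as given, compares them normalized."""

    def __init__(self):
        # normalized key -> (name exactly as first stored, node)
        self._entries = {}

    def add(self, name, node):
        key = norm_key(name)
        if key in self._entries:
            stored, _ = self._entries[key]
            # A different representation of an existing name is a collision.
            raise ValueError(f"name collision: {name!r} vs existing {stored!r}")
        self._entries[key] = (name, node)

    def lookup(self, name):
        # One lookup suffices because the key is already normalized.
        entry = self._entries.get(norm_key(name))
        return None if entry is None else entry[1]

    def listing(self):
        # The original, un-normalized spellings are what the user sees.
        return [stored for stored, _ in self._entries.values()]
```

With this shape, an existing repository that already contains a decomposed name keeps it byte-for-byte, and only an attempt to add the composed spelling next to it fails.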
Thanks for working on normalization,
/Thomas Å.
Received on 2013-12-07 19:27:04 CET