
Re: svn commit: r1408325 - /subversion/branches/wc-collate-path/subversion/libsvn_subr/sqlite.c

From: Thomas Åkesson <thomas_at_akesson.cc>
Date: Sun, 20 Jan 2013 21:15:31 +0100

First of all, I am really sorry that I did not notice this thread while it was ongoing. Due to time constraints, my contributions to Subversion happen only now and then.

I have spent quite a bit of time writing the wiki pages, experimenting, and discussing with the people who have shown interest (Branko, Julian, Ben and a couple of others in person at Subversion Live, London). I am very saddened to see this negative attitude towards resolving this long-standing issue. No doubt most Mac OS X users working in non-ASCII languages are too.

Branko, if you could summarize your findings from the collation experiments and which parts actually made it back to trunk, I would find that very interesting. I would like to do further experiments if possible.

Responding to some of Bert's concerns below.

On 12 nov 2012, at 18:34, "Bert Huijben" <bert_at_qqmail.nl> wrote:

>> -----Original Message-----
>> From: Branko Čibej [mailto:brane_at_wandisco.com]
>> Sent: maandag 12 november 2012 17:49
>> To: dev_at_subversion.apache.org
>> Subject: Re: svn commit: r1408325 - /subversion/branches/wc-collate-
>> path/subversion/libsvn_subr/sqlite.c
>>
>> It's all described and discussed here:
>>
>> http://wiki.apache.org/subversion/UnicodeComposition
>>
>> This branch is only exploring the client-side effects. The server needs
>> to adjust to make the whole thing bullet-proof.
>
> I don't see a discussion of and/or answers to many questions in http://svn.apache.org/repos/asf/subversion/trunk/notes/unicode-composition-for-filenames in there.

Please add a reference to the wiki page in this note. The wiki to a large extent supersedes the note, and refers back to it.

>
> The most important: How are you going to handle the current hashtable approach in performance critical things like 'svn status'?
>
> [I don't think a WIKI is the right place to discuss such topics, but that is a different topic]

I think the wiki is a great place for collaborative design. I wrote stuff in the wiki, and then posted to the list for feedback. Some people did respond with feedback...

Most of the information is in the page linked from the wiki page mentioned above:
http://wiki.apache.org/subversion/NonNormalizingUnicodeCompositionAwareness

It discusses pros and cons of repository normalization. The design predates the collation idea.

>
> This involves a solution for how you are going to handle duplicate names. Many existing users only find these problems after committing a problematic file. In many cases they will remove that file and maybe add the same name with a different encoding. A mixed revision working copy (or an svn up from one to the other) can then have both files.

The wiki page actually does discuss this. It cannot be fully resolved for Mac OS X users without Svn 1.x compatibility issues, but we can move Subversion from "completely unusable" to "usable from a certain revision and forward". That would be a great step forward.

Please do provide feedback on which cases are not covered in the wiki.

>
> A normalization library and the right collate indexes won't resolve those problems.
> I don't think we can just apply a UNIQUE constraint or something without breaking compatibility?

The wiki article proposes that we introduce "normalization-uniqueness". I think very contrived use cases are needed to oppose that.
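
To illustrate what "normalization-uniqueness" could mean in practice, here is a minimal Python sketch. It assumes the rule is "no two names in the same directory may be equal after Unicode normalization"; the function name is hypothetical and this is not Subversion code, just a way to show the kind of collision being discussed.

```python
import unicodedata


def find_normalization_collisions(names):
    """Group names that differ as code-point sequences but are equal after NFC.

    Hypothetical helper for illustration only; not part of Subversion.
    """
    groups = {}
    for name in names:
        # Use the NFC form as the comparison key; the names themselves
        # are left untouched (no normalization of stored data).
        key = unicodedata.normalize("NFC", name)
        groups.setdefault(key, []).append(name)
    # Keep only keys under which more than one distinct spelling was seen.
    return {k: v for k, v in groups.items() if len(set(v)) > 1}


# Precomposed: a single code point U+00C5.
composed = "\u00c5kesson.txt"
# Decomposed: 'A' followed by U+030A COMBINING RING ABOVE, the kind of
# form typically produced by Mac OS X filesystems.
decomposed = "A\u030akesson.txt"

collisions = find_normalization_collisions([composed, decomposed, "readme.txt"])
```

The two spellings look identical on screen but compare as different strings, so a repository can end up holding both. A check of this kind would report them as one colliding group while leaving unrelated names alone.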

> I would have hoped to see an explanation on what you are trying to resolve in a BRANCH-README or in the Wiki.
> And given the information in 'unicode-composition-for-filenames' I don't see a libsvn_wc only solution to these issues.

No, as noted in the wiki.

Again, sorry for not noticing the thread earlier,
/Thomas Å.
Received on 2013-01-20 21:16:05 CET

This is an archived mail posted to the Subversion Dev mailing list.
