
RE: svn commit: r1408325 - /subversion/branches/wc-collate-path/subversion/libsvn_subr/sqlite.c

From: Bert Huijben <bert_at_qqmail.nl>
Date: Mon, 12 Nov 2012 17:11:38 +0100

> -----Original Message-----
> From: brane_at_apache.org [mailto:brane_at_apache.org]
> Sent: maandag 12 november 2012 16:37
> To: commits_at_subversion.apache.org
> Subject: svn commit: r1408325 - /subversion/branches/wc-collate-
> path/subversion/libsvn_subr/sqlite.c
>
> Author: brane
> Date: Mon Nov 12 15:36:47 2012
> New Revision: 1408325
>
> URL: http://svn.apache.org/viewvc?rev=1408325&view=rev
> Log:
> On the wc-collate-path branch: Enable GLOB and LIKE operator
> replacements.

This is completely unrelated to this patch, but I'm still wondering what your overall approach/plan for this branch will be.

I can see that we handle this collation in sqlite (even though this breaks using a plain sqlite3 as a tool on wc.db, etc.), but notes/unicode-composition-for-filenames describes several other problems that need to be fixed at the same time in order not to break at least some current Subversion users.

One of these things is that we use hashtables to represent all nodes in a directory in several places. In some cases we get this from the working copy, in some cases from the db, and in still other cases from the repository. Some of these may be normalized in some way, while others are not (especially with our compatibility guarantees within 1.X).
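To make the hashtable concern concrete, here is a minimal sketch in Python (not Subversion code). The file name, the `entries_from_repo` table and the `lookup` helper are made up for illustration; the only assumption is that one source hands us a composed (NFC) name and another a decomposed (NFD) one.

```python
import unicodedata

# The same visible file name in two Unicode normalization forms.
name_nfc = unicodedata.normalize("NFC", "caf\u00e9.txt")  # 'é' as one code point
name_nfd = unicodedata.normalize("NFD", "caf\u00e9.txt")  # 'e' + combining acute

# Hypothetical directory listing, keyed by the name as one source stored it.
entries_from_repo = {name_nfc: "node-1"}


def lookup(entries, name):
    """Plain hash lookup: matches only if the code points are identical."""
    return entries.get(name)


found_same_form = lookup(entries_from_repo, name_nfc)   # hit
found_other_form = lookup(entries_from_repo, name_nfd)  # miss, same visible name
```

Both names render identically, yet the second lookup misses, which is the kind of mismatch that can appear when entries from the working copy, the db and the repository are mixed in one table.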

I'm afraid that just getting wc.db compatible with normalization will just shift the problem one layer, while still not fixing the real problem. Erik Huelsmann thoroughly investigated this problem space some years ago and he documented that fixing the wc library is not enough for fixing the generic case. And if we are not fixing the generic case, I'm wondering if we should really work on a major slowdown of every common operation.

We currently have a binary format that can be used as a hash key, so many comparison and lookup operations are constant time.
I'm not sure what their cost will be after installing the collation handling.
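As a rough illustration of the cost question for the in-memory case only (this says nothing about how sqlite itself evaluates a collation), compare an exact hash lookup with a lookup that has to treat equivalent names as equal while the stored keys may be in any form. Both function names are hypothetical, and NFC is just an assumed comparison form.

```python
import unicodedata


def exact_lookup(entries, name):
    """One hash probe on the raw key: constant time on average."""
    return entries.get(name)


def collated_lookup(entries, name):
    """Equivalence-aware lookup over keys in unknown forms.

    Without a guarantee about how the keys were stored, every key has to
    be normalized and compared, so the cost grows with the table size.
    """
    wanted = unicodedata.normalize("NFC", name)
    for key, value in entries.items():
        if unicodedata.normalize("NFC", key) == wanted:
            return value
    return None
```

The linear scan can be avoided by normalizing every key once on insert, but that only works if every producer of keys does so.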

If we set the generic case aside, there are easier ways to resolve this issue. One such thing would be to make apr (or a wrapper in Subversion) normalize the on-disk paths in the other direction and deny (on the server) the non-normalized paths. This would eliminate the slowdown for most use cases that don't have a problem right now, and keep the code clean for future problems.
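A minimal sketch of that alternative, again in Python rather than apr or Subversion code. The choice of NFC as the single accepted form and the names `from_disk` and `server_accepts` are assumptions for illustration; `unicodedata.is_normalized` needs Python 3.8 or later.

```python
import unicodedata

CANONICAL_FORM = "NFC"  # assumption: the one form the server would accept


def from_disk(path):
    """Client-side wrapper: normalize a name read from the filesystem."""
    return unicodedata.normalize(CANONICAL_FORM, path)


def server_accepts(path):
    """Server-side check: deny any path that is not already normalized."""
    return unicodedata.is_normalized(CANONICAL_FORM, path)
```

With this split, everything between the filesystem wrapper and the server only ever sees one form, so the existing byte-wise hash keys keep working unchanged.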

If we have to check for collation handling everywhere in libsvn_wc and libsvn_client, we make it much harder for outside developers to create patches, and even fewer core Subversion developers would dare to touch these layers.

I'm glad somebody is finally looking into these issues, but I think we should look at the full picture before we can talk about getting this back on trunk.

        Bert
Received on 2012-11-12 17:12:21 CET

This is an archived mail posted to the Subversion Dev mailing list.
