A modest proposal: No index or "log -g" in Subversion 1.5

From: David Glasser <glasser_at_davidglasser.net>
Date: 2007-11-30 02:12:08 CET

I have been experimenting with an alternate backend for svn:mergeinfo
data over the last day. I have come to the conclusion that by
delaying one feature to 1.6 (which was originally proposed as a 1.6
feature anyway), we can vastly simplify the svn:mergeinfo backend,
remove some pretty difficult bugs, and get a satisfactory 1.5 released
relatively soon.

Specifically, my experiments have taught me a few things:

* Almost all of merge-tracking on trunk now requires no index at all;
  most of the queries are just trying to look up svn:mergeinfo on a
  specific path at a specific revision, which is exactly what the FS
  itself does.

* ... except that the FS itself handles things like "copying a node
  copies everything below it" and "deleting a node deletes everything
  below it"; our current sqlite code does not handle that, making it
  easy to corrupt the index. And implementing that for the sqlite
  code would be tantamount to a complete Subversion-DAG-FS model
  implemented for our index.

* The only command that requires a more sophisticated query against
  the index is "svn log -g", which essentially needs to do the query
  "at revision R, what are all the paths under P that have mergeinfo?"

* I have a completely working implementation for FSFS that keeps
  enough metadata in the DAG itself to answer that question
  efficiently. So we really don't need to use sqlite for that.
  It would probably require a db format bump, but that's not too big a
  deal (and wouldn't really need a dump/load; I can give more details
  if you want). Hopefully the BDB implementation wouldn't be hard
  either.

* My implementation does a little more error-checking than the sqlite
  implementation; specifically, the sqlite implementation doesn't care
  if you ask for mergeinfo about paths that don't exist, whereas
  mine does (though it could suppress that error, of course). That
  extra checking is already showing me a bunch of bugs all throughout
  the client code, and especially in "log -g", where the code is
  passing in the wrong paths.

* Kamesh's issue-2897 branch would require more sophisticated queries.
  (And in fact I think that those queries might enable "log -g" to do
  its job better.) But it's controversial whether or not we should
  try to get issue 2897 in for 1.5; it's a big, big problem with no
  simple answer. In addition, this would require us to fix the
  serious bugs in the sqlite index mentioned above.
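
To make the "log -g" query and the copy/delete point concrete, here is
a minimal Python sketch. It is not the FSFS implementation described
above: the Node class, the per-node mergeinfo_count field, and the
helper names are all hypothetical, and the assumption that the extra
DAG metadata is a subtree count is mine for illustration only. It shows
how immutable tree nodes that carry a little metadata let a copy or a
delete update the answer for free, and let the query skip subtrees that
contain no mergeinfo.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical immutable node, standing in for one node-revision."""
    mergeinfo: Optional[str] = None            # svn:mergeinfo value, if set
    children: Dict[str, "Node"] = field(default_factory=dict)
    # Assumed metadata: how many nodes at or below this one carry mergeinfo.
    mergeinfo_count: int = field(init=False, default=0)

    def __post_init__(self) -> None:
        own = 1 if self.mergeinfo is not None else 0
        total = own + sum(c.mergeinfo_count for c in self.children.values())
        object.__setattr__(self, "mergeinfo_count", total)


def lookup(root: Node, path: str) -> Node:
    """Walk from a revision root to path; KeyError if the path is absent."""
    node = root
    for name in filter(None, path.split("/")):
        node = node.children[name]
    return node


def with_child(parent: Node, name: str, child: Optional[Node]) -> Node:
    """Return a new parent with one entry set (or deleted if child is None)."""
    children = dict(parent.children)
    if child is None:
        children.pop(name, None)
    else:
        children[name] = child
    return Node(mergeinfo=parent.mergeinfo, children=children)


def paths_with_mergeinfo(root: Node, path: str) -> Iterator[str]:
    """At this revision root, yield every path under `path` with mergeinfo."""
    def walk(node: Node, prefix: str) -> Iterator[str]:
        if node.mergeinfo_count == 0:
            return                              # prune: nothing below here
        if node.mergeinfo is not None:
            yield prefix or "/"
        for name in sorted(node.children):
            yield from walk(node.children[name], prefix + "/" + name)

    parts = [p for p in path.split("/") if p]
    yield from walk(lookup(root, path), "".join("/" + p for p in parts))


# r1: /trunk/lib carries mergeinfo.
trunk = Node(children={"lib": Node(mergeinfo="/vendor/lib:1-5"),
                       "doc": Node()})
r1 = Node(children={"trunk": trunk})
# r2: copy /trunk to /branch by pointing at the same subtree.
r2 = with_child(r1, "branch", lookup(r1, "/trunk"))
# r3: delete /trunk; everything below it goes with it.
r3 = with_child(r2, "trunk", None)

for rev, root in ((1, r1), (2, r2), (3, r3)):
    print(rev, list(paths_with_mergeinfo(root, "/")))
```

Because the copy in r2 reuses the same subtree object, the mergeinfo
below /branch is found without any separate bookkeeping, which is the
property a flat index keyed by path would have to reimplement. Asking
about a path that does not exist raises an error rather than returning
an empty answer.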

I would like to propose the following:

* We do not attempt to solve Issue 2897 for 1.5. It is probably
  possible to solve it, but it will take a lot of work, have lots of
  subtleties, etc.

* We disable "log -g" for 1.5. "log -g" was originally proposed as a
  1.6 command; it only switched to 1.5 because Hyrum finished his
  implementation. (And there are a lot of good things about log -g; I
  certainly have respect for Hyrum's work, and expect that 1.6 could
  contain a fixed version of it.) This would allow us to ignore the
  "log -g" bugs for now and focus on bugs more central to merge
  tracking as opposed to just this one auditing feature.

* Because we no longer need it, we remove the sqlite mergeinfo index
  from 1.5. This removes a huge amount of code complexity from the FS
  backends, and lets us not worry about fixing the bugs in keeping the
  indices up to date. For the same reason, we don't use my
  extra-metadata-in-DAG thing either.

When working on 1.6, we can solve #2897 and fix "log -g" with much
more leisure to get it right. If fixing them requires bringing back
the sqlite index, or using my metadata idea, then so be it: we can add
that code back in for 1.6 (it's all in version control) and make it
work for those needs then.

But I think we can make 1.5 much more solid and less complex by simply
deferring #2897 and "log -g" to 1.6. 1.5 will still have a superset
of svnmerge.py's features.

(I don't mean to disrespect the hard work done on the sqlite backend,
"log -g", or issue-2897 here. I just think that these are difficult
problems to solve, and that making a release that doesn't try to solve
them, then fixing them with more leisure, is better than trying to do
everything at once and shipping a release full of bugs.)

--dave

-- 
David Glasser | glasser_at_davidglasser.net | http://www.davidglasser.net/
Received on Fri Nov 30 02:12:21 2007

This is an archived mail posted to the Subversion Dev mailing list.