major Subversion progress

From: Karl Fogel <kfogel_at_collab.net>
Date: 2001-07-13 00:38:44 CEST

I know not everyone is following the CVS commit mails closely, so I'd
like to point out that something totally awesome just happened:

The Subversion repository now does deltified storage. In other words,
we can take our filesystem code seriously now! :-) Big-time kudos to
Mike Pilato <cmpilato@collab.net> for finishing this one off.

This was the last truly major coding task on the todo list, as far as
Subversion itself goes -- everything else we need to get to alpha is
relatively small: keyword substitution and newline conversion, some
fixes in the working copy code, etc. Some things, like the hook and
auth systems, need some more design think-time, but once designed,
they are easy to implement.

A couple of notes:

   1. There is still one big thing that needs to happen before M3: a
      cvs2svn repository converter :-). If anyone wants to volunteer,
      that would be a fine thing... You wouldn't be all alone: many
      on this list have a good understanding of how the converter
      needs to work, and various ways to make it efficient. The main
      reason this is a big task is that the converter needs to have a
      *very* thorough test suite, because (obviously) we can't afford
      to corrupt data.

   2. The deltification code still has room for improvement. While
      the conversion from fulltext to delta is quite efficient
      (probably only minor improvements are possible from here), the
      undeltification code will probably want tweaking at some point
      in the future. Currently, fulltext reconstruction uses memory
      proportional to the delta window size times the number of
      intermediate deltas between the target and the "base" fulltext
      (the latter usually being the most recent revision of the
      object); there are also some running time issues.
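To make note (2) concrete, here is a minimal Python sketch of deltified storage where only the newest revision is kept as fulltext and each older revision is a delta against its successor. It is an illustration under stated assumptions, not Subversion's actual code: the copy/insert instruction format, the use of difflib, and the names make_delta, apply_delta, and DeltifiedStore are all hypothetical, and windowing is ignored. It does show why reading an old revision costs one delta application per intermediate revision.

```python
from difflib import SequenceMatcher


def make_delta(source: bytes, target: bytes):
    """Encode target as copy/insert instructions against source (hypothetical format)."""
    ops = []
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))        # reuse bytes from source
        elif tag in ("replace", "insert"):
            ops.append(("insert", target[j1:j2]))    # new bytes carried in the delta
        # "delete" needs no instruction: those source bytes are simply not copied
    return ops


def apply_delta(source: bytes, delta) -> bytes:
    """Rebuild a text by running a delta's instructions against source."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, offset, length = op
            out += source[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


class DeltifiedStore:
    """Newest revision stored as fulltext; older ones as deltas against their successor."""

    def __init__(self):
        self.fulltext = None
        self.deltas = []  # deltas[i] turns revision i+1 into revision i

    def commit(self, text: bytes) -> int:
        if self.fulltext is not None:
            # Deltify the previous head against the new head.
            self.deltas.append(make_delta(text, self.fulltext))
        self.fulltext = text
        return len(self.deltas)  # revision number, starting at 0

    def read(self, rev: int) -> bytes:
        # Walk back from the fulltext, one delta per intermediate revision.
        text = self.fulltext
        for delta in reversed(self.deltas[rev:]):
            text = apply_delta(text, delta)
        return text


revisions = [b"one\ntwo\n", b"one\ntwo\nthree\n", b"zero\none\ntwo\nthree\n"]
store = DeltifiedStore()
for contents in revisions:
    store.commit(contents)
```

Reading the newest revision applies no deltas at all, while reading revision 0 here applies two; that growth with distance from the fulltext is the cost discussed in note (2).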

Regarding (2):

Mike, Ben, and I all feel that it would be premature optimization
to write the Ultimate Efficient Delta Combiner at this time, when
there are so many other important tasks that need doing. It is
perfectly acceptable to get to 1.0 with the undeltification method we
have now, which should perform well enough in most circumstances.

Doing real delta combination will get us a big-O improvement for
retrieving large files, and we more or less know how to do it.
However, it is a very complex coding task and IOHO it's more important
to get Subversion to feature-completeness than to optimize every last
drop out of the delta reconstruction algorithm. (This is not intended
to discourage others from working on it right now if they want, of
course!)
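For readers unfamiliar with the idea, the following Python sketch shows what "delta combination" means in its simplest form: two deltas (base -> intermediate, intermediate -> target) are merged into one delta (base -> target) without ever building the intermediate text. The copy/insert instruction format and the names apply_delta and compose are assumptions made for illustration; this is not the algorithm planned for Subversion, and it makes no attempt at the efficiency a real combiner would need.

```python
def apply_delta(source: bytes, delta) -> bytes:
    """Run ("copy", offset, length) / ("insert", data) instructions against source."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += source[op[1]:op[1] + op[2]]
        else:
            out += op[1]
    return bytes(out)


def compose(first, second):
    """Merge first (base -> mid) and second (mid -> target) into base -> target."""
    # Record which byte range of the intermediate text each instruction
    # of `first` produces.
    spans = []
    pos = 0
    for op in first:
        length = op[2] if op[0] == "copy" else len(op[1])
        spans.append((pos, pos + length, op))
        pos += length

    combined = []
    for op in second:
        if op[0] == "insert":
            combined.append(op)  # literal data needs no translation
            continue
        _, start, length = op
        end = start + length
        # Translate a copy from the intermediate text into the instructions
        # of `first` that produced that range. A linear scan keeps the sketch
        # short; a real implementation would index the spans.
        for lo, hi, src in spans:
            a, b = max(lo, start), min(hi, end)
            if a >= b:
                continue  # this span does not overlap the requested range
            if src[0] == "copy":
                combined.append(("copy", src[1] + (a - lo), b - a))
            else:
                combined.append(("insert", src[1][a - lo:b - lo]))
    return combined


base = b"hello world"
first = [("copy", 0, 6), ("insert", b"brave "), ("copy", 6, 5)]
second = [("copy", 6, 11), ("insert", b"!")]
combined = compose(first, second)
```

Applying combined to base gives the same bytes as applying first and then second, but in a single pass over base.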

With all this in mind:

Mike and I are going to spend tomorrow writing a slew of new fs tests,
whose goal will be to stress the new deltified storage code, and make
sure it performs acceptably at least where CVS performs acceptably.
Then starting Monday, we're going to bid the deltification code adieu
and join everyone else on the rest of alpha-checklist.txt. The goal
is to get foolproof regression insurance in place and then move on.
That way, when we later do improve the algorithm, we will know right
away if it works and won't need to worry about whether it's
functionally equivalent to the old code.

Okay, enough happy-talk; clearly there's still a lot of code to write.
Back to work now... Keep those patches coming!

-K

Received on Sat Oct 21 14:36:33 2006
