
Re: [PATCH] Skip-deltas, for review

From: Daniel Berlin <dberlin_at_dberlin.org>
Date: 2002-07-27 03:06:36 CEST

On 26 Jul 2002, Greg Hudson wrote:

> On Fri, 2002-07-26 at 20:01, Karl Fogel wrote:
> > Hmmm. I hate to say it after the work you've done, but with these
> > numbers it's hard to see why Subversion should incorporate this
> > change.
>
> Well, that's a reasonable position to have, but our repository might be
> a deceptively simple case. As I noted in my first message, Branko's
> delta combiner doesn't actually seem to improve the speed of day-to-day
> operations in my tests either, when operating on our repository. (It
> does fix the deltifying-files-larger-than-chunk-size bug, because he
> rewrote the relevant piece of code, but that bug could certainly be
> fixed without introducing such a huge piece of machinery.)

I can say that when converting the gcc repo, where we have thousands upon
thousands of revisions of various moderately sized files, like the
ChangeLog, if I make it deltify (it's 0k-800k depending on the
revision, so most won't deltify unless I up the stream size) without
skip-deltas and the delta combiner, trying to access revision 100 out of
4000 takes literally 10 minutes.
With just the delta combiner, it's 20 seconds.
With these two patches, it takes 4 seconds.

The first time is unacceptable for use.
The second is okay, but could be a lot better.
The difference is also not lost in the noise in a use case where *most* of
the accesses are read-only operations (diffs between revisions, etc.).

This is because the disk is going to be hit to access the database keys of
these revisions. If it's got to walk 2500 database keys vs. 40, and 300
people are trying to do this at once, your skip lists will be a *bit*
faster.
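
To illustrate why the number of keys walked drops so sharply, here is a
minimal sketch, not the patch itself. It assumes one common skip-delta
scheme in which revision 0 is stored as fulltext and revision n is stored
as a delta against revision n with its lowest set bit cleared; the
function names are hypothetical, and the patch under review may pick its
delta bases differently, so the hop counts here will not match the
figures quoted above.

```python
def linear_chain_length(rev: int) -> int:
    """Deltas to apply when every revision is a delta against its
    immediate predecessor and revision 0 is the fulltext (assumed layout)."""
    return rev


def skip_delta_base(rev: int) -> int:
    """Assumed skip-delta rule: the base of revision `rev` is `rev` with
    its lowest set bit cleared, e.g. 2500 -> 2496."""
    return rev & (rev - 1)


def skip_chain(rev: int) -> list:
    """Revisions whose deltas must be read to reconstruct `rev`,
    stopping once the fulltext at revision 0 is reached."""
    chain = []
    while rev > 0:
        chain.append(rev)
        rev = skip_delta_base(rev)
    return chain


if __name__ == "__main__":
    for rev in (100, 2500, 4000):
        print(rev, linear_chain_length(rev), len(skip_chain(rev)))
```

Under this assumed rule the chain length equals the number of set bits in
the revision number, so reaching revision 2500 reads 5 deltas instead of
2500, and no revision up to 4000 needs more than 11.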

Most people using the gcc repo are *not* the developers of gcc, they are
users tracking it.

It is perfectly reasonable for them to compare revisions of changelogs
that are 2500 revisions apart (gcc 3.1 branch vs gcc head is 1.14977 -
1.13152.2.657, which is 1825 + 657 = 2482 revisions apart), and when doing
so, it shouldn't take them 5 years and 8 minutes of the server's CPU time.
:)

--Dan

Received on Sat Jul 27 03:07:07 2002

This is an archived mail posted to the Subversion Dev mailing list.
