
Re: Looking to improve performance of svn annotate

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: Wed, 11 Aug 2010 19:33:15 -0400

On Wed, 2010-08-11 at 19:14 -0400, Johan Corveleyn wrote:
> I naively thought that the server, upon being called get_file_revs2,
> would just supply the deltas which it has already stored in the
> repository. I.e. that the deltas are just the native format in which
> the stuff is kept in the back-end FS, and the server wasn't doing much
> else but iterate through the relevant files, and extract the relevant
> bits.

The server doesn't have deltas between each revision and the next (or
previous). Instead, it has "skip-deltas" which may point backward a
large number of revisions. This allows any revision of a file to be
reconstructed in O(log(n)) delta applications, where n is the number of
file revisions, but it makes "what the server has lying around" even
less useful for blame output.
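
To make the O(log(n)) claim concrete, here is a minimal sketch of one skip-delta scheme: the delta base for the n-th version of a file is found by clearing the lowest set bit of n. This is an illustration under that assumption, not code from the Subversion source; the function names are hypothetical and the actual back-end may choose bases differently.

```python
def skip_delta_base(n):
    """Return the version index used as the delta base for version n (n >= 1).

    Assumed scheme: clear the lowest set bit of n.
    """
    return n & (n - 1)


def delta_chain(n):
    """List the versions visited when reconstructing version n.

    The chain ends at version 0, which is assumed to be stored as full text.
    """
    chain = [n]
    while n > 0:
        n = skip_delta_base(n)
        chain.append(n)
    return chain


if __name__ == "__main__":
    # Each step clears one bit, so the number of delta applications is the
    # number of set bits in n, which is at most about log2(n) + 1.
    for version in (1, 54, 1023, 1024):
        chain = delta_chain(version)
        print(version, "->", chain, "deltas applied:", len(chain) - 1)
```

Note that under this scheme the delta for version 54 is computed against version 52, not 53, which is why the stored deltas do not directly describe "what changed in this revision".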

It's probably best to think of the FS as a black box which can produce
any version of a file in reasonable time. If you look at
svn_repos_get_file_revs2 in libsvn_repos/rev_hunt.c, you'll see the code
which produces deltas to send to the client, using
svn_fs_get_file_delta_stream.

The required code changes for this kind of optimization would be fairly
deep, I think. You'd have to invent a new type of "diffy" delta
algorithm (either line-based or binary, but either way producing an edit
stream rather than acting like a compression algorithm), and then
parameterize a bunch of functions which produce deltas, and then have
the server-side code produce diffy deltas, and then have the client code
recognize when it's getting diffy deltas and behave more efficiently.
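
As a rough sketch of what a line-based "diffy" delta and its client-side consumer could look like: the delta below is an edit stream of copy and insert operations, and the blame code updates line attributions directly from that stream instead of reconstructing both full texts and diffing them again. The op format and the function names are hypothetical, and Python's difflib stands in for whatever diff algorithm a real implementation would use.

```python
import difflib


def diffy_delta(old_lines, new_lines):
    """Produce an edit stream turning old_lines into new_lines.

    Ops are ("copy", start, length), referring to old_lines, or
    ("insert", [lines]) carrying new text. Deleted lines are simply
    never copied.
    """
    ops = []
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))
        elif tag in ("replace", "insert"):
            ops.append(("insert", new_lines[j1:j2]))
    return ops


def apply_to_blame(blame, ops, rev):
    """Advance a blame list of (revision, line) pairs by one edit stream.

    Copied lines keep their earlier attribution; inserted lines are
    attributed to rev.
    """
    result = []
    for op in ops:
        if op[0] == "copy":
            _, start, length = op
            result.extend(blame[start:start + length])
        else:
            result.extend((rev, line) for line in op[1])
    return result


if __name__ == "__main__":
    r1 = ["a\n", "b\n", "c\n"]
    r2 = ["a\n", "B\n", "c\n", "d\n"]
    blame = [(1, line) for line in r1]
    blame = apply_to_blame(blame, diffy_delta(r1, r2), 2)
    for rev, line in blame:
        print(rev, line, end="")
```

The point of the sketch is that the client never runs a diff: the server-side delta already says which lines survived and which are new.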

If the new diffy-delta algorithm isn't format-compatible with the
current encoding, you'd also need some protocol negotiation.

Received on 2010-08-12 01:34:00 CEST

This is an archived mail posted to the Subversion Dev mailing list.
