Making blame even faster

From: Daniel Berlin <dberlin_at_dberlin.org>
Date: 2005-02-09 16:24:06 CET

The only complaint I have heard from gcc people so far is the speed of
blame/annotate. I'm pretty sure that if I can make blame run at a
reasonable speed, nobody will object to us switching.
I would appreciate *any* help people can offer me in implementing any of
the solutions below.

There are a couple of things to note:
1. Blame on some of our files takes > 30 minutes.
2. Nobody cares whether their blame is exactly like cvs, as long as it
has some sane format.
3. Byte level blame is fine (i.e., we don't need to care about lines).

The current blame is slow because, AFAICT, it actually expands every
revision to full text (using get_file_revs). This makes it O(n^2) at
best.

There are a couple of ways to fix this:

1. For a simple byte level blame, we should be able to get the answers
from the diff format alone, without actually expanding it. You can
simply mark the affected byte ranges, and stop when the whole file has
been affected. What the output looks like is up for grabs.

2. You could actually do a similar thing with line level blame.
Discover the line byte ranges of the current revision, throw them into
an interval tree, and then do queries to see which lines a given diff
affects, based on its byte ranges. Stop when every line has been
affected.
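
As a rough illustration of approach 1, here is a minimal sketch in
Python. It assumes a deliberately simplified, hypothetical delta model
(not the real svndiff encoding): each revision is described relative to
its immediate predecessor by a list of ("copy", src_offset, length) and
("new", length) ops. The function name byte_blame and the op tuples are
made up for this example. It walks from the newest revision backwards,
maps the still-unattributed byte ranges through the copy ops, and stops
as soon as every byte has been attributed, without ever building a full
text.

```python
def byte_blame(deltas):
    """Blame each byte of the newest revision without expanding any text.

    deltas[i] is the op list that builds revision i+1 from revision i
    (revision 0 is the empty file). Ops are hypothetical:
      ("copy", src_offset, length)  bytes taken from the previous revision
      ("new", length)               bytes introduced by this revision
    Returns a list with one revision number per byte of the newest revision.
    """
    head_len = sum(op[-1] for op in deltas[-1])
    blame = [None] * head_len
    # Each pending entry is (offset in head, offset in current rev, length).
    pending = [(0, 0, head_len)]
    for rev in range(len(deltas), 0, -1):
        if not pending:
            break  # every byte is attributed; older revisions are irrelevant
        next_pending = []
        tgt = 0  # target offset reached so far in this revision's text
        for op in deltas[rev - 1]:
            length = op[-1]
            t0, t1 = tgt, tgt + length
            for head_start, cur_start, n in pending:
                lo = max(t0, cur_start)
                hi = min(t1, cur_start + n)
                if lo >= hi:
                    continue  # this op does not overlap the pending range
                h = head_start + (lo - cur_start)
                if op[0] == "new":
                    # These bytes first appeared in this revision.
                    blame[h:h + (hi - lo)] = [rev] * (hi - lo)
                else:
                    # Still unattributed: translate to the older revision.
                    src = op[1] + (lo - t0)
                    next_pending.append((h, src, hi - lo))
            tgt = t1
        pending = next_pending
    return blame


# Revision 1: "hello world", revision 2 inserts a comma after "hello",
# revision 3 replaces the first byte.
history = [
    [("new", 11)],
    [("copy", 0, 5), ("new", 1), ("copy", 5, 6)],
    [("new", 1), ("copy", 1, 11)],
]
result = byte_blame(history)
```

The overlap search here is a plain nested loop to keep the sketch
short; a real implementation would presumably keep the pending ranges
sorted, and would have to cope with deltas that are not against the
immediate predecessor.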

2 seems a bit harder than 1.
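
For approach 2, a minimal sketch of the line lookup, again in Python.
Because the lines of one revision are disjoint and contiguous, this
sketch swaps the interval tree for a sorted list of line start offsets
and a binary search; that is a simplification for illustration, not
necessarily what a real implementation would want. The names
line_starts, lines_touched and line_blame are hypothetical, and the
input is assumed to be a newest-first list of (rev, lo, hi) triples
giving the byte ranges [lo, hi) of the current revision that each
revision introduced.

```python
from bisect import bisect_right


def line_starts(data):
    """Return the byte offset at which each line of data begins."""
    starts = [0]
    for i, b in enumerate(data):
        if b == 0x0A and i + 1 < len(data):
            starts.append(i + 1)
    return starts


def lines_touched(starts, lo, hi):
    """Return the indices of the lines overlapping byte range [lo, hi)."""
    first = bisect_right(starts, lo) - 1
    last = bisect_right(starts, hi - 1) - 1
    return range(first, last + 1)


def line_blame(data, changes):
    """Attribute each line to the newest revision that touched it.

    changes is an iterable of (rev, lo, hi), newest revision first.
    """
    if not data:
        return []
    starts = line_starts(data)
    blame = [None] * len(starts)
    remaining = len(starts)
    for rev, lo, hi in changes:
        for i in lines_touched(starts, lo, hi):
            if blame[i] is None:
                blame[i] = rev
                remaining -= 1
        if remaining == 0:
            break  # every line has been affected; stop early
    return blame


text = b"ab\ncd\nef\n"
blamed = line_blame(text, [(3, 4, 5), (2, 0, 3), (1, 3, 9)])
```

The extra work compared to byte level blame is mostly in producing
those byte ranges in the coordinates of the current revision; the
lookup itself stays cheap.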

I've been out of it for a while, so maybe there is something about
implementing #1 that I have missed.
Is there anything about #1 that makes it hard?

It seems like one can reuse some of the code that does delta
composition, since it is probably already doing a lot of this work.

--Dan

Received on Wed Feb 9 16:25:36 2005

This is an archived mail posted to the Subversion Dev mailing list.
