Re: Looking to improve performance of svn annotate

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Fri, 20 Aug 2010 21:11:55 +0200

On Fri, Aug 20, 2010 at 1:40 PM, Johan Corveleyn <jcorvel_at_gmail.com> wrote:
> After eliminating antivirus, and running with a release build instead
> of a debug build, svn diff is just about on par with GNU diff. So this
> eliminates the option of optimizing diff ...

Unless ...

For every diff during blame calculation, tokens (lines) are extracted
and processed from scratch for both the "original" and the "modified"
file. This takes up most of the time of svn diff. However, the file
that plays the "modified" role in one step will be the "original" in
the next step of blame processing. So in theory we already have those
tokens from the previous step, and don't have to extract/compare/...
them again.

If one could find a way to recycle the tokens from the "modified" of
the previous diff into the tokens of the "original" of the next diff,
each file would be tokenized only once instead of twice. That would
probably make the diff part of the blame operation almost twice as
fast. And since diffing still accounts for ~90% of blame time on the
client, that should speed up blame as a whole considerably.
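
To make the counting argument concrete, here is a rough sketch in Python.
It is not Subversion's code (the real diff library is C and its token
handling is more involved). The function names are made up for
illustration, and difflib stands in for the actual diff algorithm. It only
shows that a chain of N revisions needs 2*(N-1) tokenizations when both
sides are extracted for every diff, but only N when the "modified" tokens
are carried over as the next "original".

```python
import difflib


def tokenize(text, counter):
    """Split a file into line tokens (hypothetical stand-in for token
    extraction); counter[0] records how many times this is done."""
    counter[0] += 1
    return text.splitlines()


def diff_tokens(original, modified):
    """Diff two token lists; difflib is only a stand-in for the real diff."""
    matcher = difflib.SequenceMatcher(None, original, modified, autojunk=False)
    return matcher.get_opcodes()


def diff_chain_naive(revisions):
    """Tokenize both sides again for every consecutive pair of revisions."""
    counter = [0]
    results = []
    for old, new in zip(revisions, revisions[1:]):
        original = tokenize(old, counter)
        modified = tokenize(new, counter)
        results.append(diff_tokens(original, modified))
    return results, counter[0]


def diff_chain_recycled(revisions):
    """Reuse the 'modified' tokens of one step as the 'original' of the next."""
    counter = [0]
    results = []
    if not revisions:
        return results, counter[0]
    original = tokenize(revisions[0], counter)
    for new in revisions[1:]:
        modified = tokenize(new, counter)
        results.append(diff_tokens(original, modified))
        original = modified  # recycle instead of extracting again
    return results, counter[0]


if __name__ == "__main__":
    revs = ["a\nb\n", "a\nb\nc\n", "a\nc\n", "a\nc\nd\n"]
    naive_diffs, naive_count = diff_chain_naive(revs)
    recycled_diffs, recycled_count = diff_chain_recycled(revs)
    print(naive_count, recycled_count)
    print(naive_diffs == recycled_diffs)
```

Both variants produce the same diffs; only the number of tokenizations
differs, and the ratio approaches 2 as the number of revisions grows.
Whether that translates into a real speedup depends on how much of the
token processing can actually be carried over between diffs.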

Sounds like a plan?

I'll try to write some sort of POC for this idea soon, unless someone
tells me it's a stupid idea :-).

Cheers,

-- 
Johan
Received on 2010-08-20 21:12:34 CEST

This is an archived mail posted to the Subversion Dev mailing list.