
Re: svn diff optimization to make blame faster?

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Tue, 24 Aug 2010 10:04:40 +0200

On Sun, Aug 22, 2010 at 4:02 PM, Branko Čibej <brane_at_xbc.nu> wrote:
> On 18.08.2010 00:59, Johan Corveleyn wrote:
>> Hi devs,
>>
>> While "Looking to improve performance of svn annotate" [1], I found
>> that the current blame algorithm is mainly client-side bound, and that
>> most of its time is spent on "svn diff" (calls to svn_diff_file_diff_2
>> from add_file_blame in blame.c). Apart from avoiding building
>> full-texts and diffing them altogether (which is the subject of further
>> discussion in [1]), I'm wondering if optimization of "svn diff"
>> wouldn't also be an interesting way to improve the speed of blame.
>>
>> So the main question is: is it worth it to spend time to analyze this
>> further and try to improve performance? Or has this already been
>> optimized in the past, or is it simply already as optimal as it can
>> get? I have no idea really, so if anyone can shed some light ...
>>
>> Gut feeling tells me that there must be room for optimization, since
>> GNU diff seems a lot faster than svn diff for the same large file
>> (with one line changed) on my machine [1]. But maybe svn's diff
>> algorithm is purposefully different (better? more accurate? ...) than
>> GNU's, or there are specific things in the svn context so svn diff has
>> to do more work.
>>
>> Any thoughts?
>>
>
> svn_diff uses basically the same algorithm as GNU diff but implemented
> slightly differently and IIRC it doesn't have some of GNU diff's
> optimizations. I'm sure it can be speeded up, but haven't a clue about
> how much.

Ok, thanks. In the meantime I saw that there is not much difference
anymore between GNU diff and svn_diff, after running the latter from
a release build and disabling my anti-virus. That makes me wonder why
my anti-virus slows down svn_diff (the impact shows up when opening
the modified datasource) but not GNU diff.

There may still be some slight speed difference (enough to be
significant for a blame operation doing hundreds or thousands of
diffs), but not as much as I thought at first. So I don't think I'm
going to spend more time on trying to speed up svn_diff (also, I'm not
really an expert at optimizing C code, so ... I'll leave that to
others :-)).
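
For anyone who wants to repeat this kind of comparison, here is a
minimal sketch of a timing harness in Python. It is not what I actually
ran; the function names (time_command, projected_overhead) are made up
for illustration, and the command lines you pass in are up to you. It
takes the best wall-clock time over a few runs of a command, and
extrapolates a per-diff difference to the number of diffs a blame
operation would perform.

```python
import subprocess
import sys
import time


def time_command(argv, runs=5):
    """Return the best wall-clock time in seconds over `runs` executions."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded. Diff tools exit non-zero when the files
        # differ, so the return code is deliberately not checked.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def projected_overhead(per_diff_a, per_diff_b, num_diffs):
    """Extra seconds spent if tool A is used instead of tool B
    for num_diffs diffs (hypothetical helper for extrapolation)."""
    return (per_diff_a - per_diff_b) * num_diffs


if __name__ == "__main__":
    # Placeholder command: substitute the diff invocations to compare,
    # e.g. ["diff", "old.txt", "new.txt"].
    t = time_command([sys.executable, "-c", "pass"])
    print(f"best of 5 runs: {t:.4f}s")
```

Taking the best of several runs reduces noise from disk caching and
from background processes such as an anti-virus scanner, though it will
not remove a slowdown that hits every run.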

Cheers,

-- 
Johan
Received on 2010-08-24 10:05:15 CEST

This is an archived mail posted to the Subversion Dev mailing list.