On Wed, Aug 18, 2010 at 12:59:21AM +0200, Johan Corveleyn wrote:
> Hi devs,
> 
> While "Looking to improve performance of svn annotate" [1], I found
> that the current blame algorithm is mainly client-side bound, and that
> most of its time is spent on "svn diff" (calls to svn_diff_file_diff_2
> from add_file_blame in blame.c). Apart from avoiding building
> full-texts and diffing them altogether (which is the subject of further
> discussion in [1]), I'm wondering whether optimizing "svn diff"
> wouldn't also be an interesting way to improve the speed of blame.
> 
> So the main question is: is it worth it to spend time to analyze this
> further and try to improve performance? Or has this already been
> optimized in the past, or is it simply already as optimal as it can
> get? I have no idea really, so if anyone can shed some light ...
> 
> Gut feeling tells me that there must be room for optimization, since
> GNU diff seems a lot faster than svn diff for the same large file
> (with one line changed) on my machine [1]. But maybe svn's diff
> algorithm is purposefully different (better? more accurate? ...) than
> GNU's, or there are specific things in the svn context so svn diff has
> to do more work.
> 
> Any thoughts?

Can you show a profiler run that illustrates where the client is
spending most of its time during diff? That would probably help with
getting opinions from people, because it saves them from spending time
doing this research themselves.

You've already hinted at svn_diff__get_tokens() in another mail, but
a real profiler run would show more candidates.

Thanks,
Stefan
Received on 2010-08-18 12:50:29 CEST
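
For anyone who wants to reproduce the "large file with one line
changed" comparison from the quoted mail before doing a real profiler
run, here is a minimal sketch of a wall-clock timing harness. It is
not a substitute for a profiler: it only shows how long a whole
command takes, not where the time goes. The helper names
(write_test_files, time_command), the file names and the default line
count are hypothetical choices for illustration and are not part of
Subversion.

```python
import subprocess
import time
from pathlib import Path


def write_test_files(directory, line_count=200_000):
    """Create a large text file and a copy that differs in exactly one line.

    The file names and line contents are arbitrary; they only mimic the
    "large file, one line changed" case described in the thread.
    """
    original = Path(directory) / "original.txt"
    modified = Path(directory) / "modified.txt"
    lines = [f"line {i}\n" for i in range(line_count)]
    original.write_text("".join(lines))
    lines[line_count // 2] = "this line was changed\n"
    modified.write_text("".join(lines))
    return original, modified


def time_command(argv, repeats=3):
    """Run argv `repeats` times and return the best wall-clock time in seconds.

    Output is discarded and the exit status is ignored, because diff
    tools commonly exit non-zero when the inputs differ.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        best = min(best, time.perf_counter() - start)
    return best
```

Assuming GNU diff is on the PATH, something like
time_command(["diff", str(old), str(new)]) times it on the two
generated files. Timing "svn diff" the same way needs the file to be
in a working copy with the one-line change as a local modification,
for example time_command(["svn", "diff", "bigfile.txt"]) run from
inside that working copy, where bigfile.txt is a placeholder name.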