Re: svn diff optimization to make blame faster?

From: Hyrum K. Wright <hyrum_wright_at_mail.utexas.edu>
Date: Tue, 21 Sep 2010 14:15:46 +0100

On Mon, Sep 20, 2010 at 12:10 PM, Johan Corveleyn <jcorvel_at_gmail.com> wrote:
> On Mon, Sep 20, 2010 at 11:52 AM, Branko Čibej <brane_at_xbc.nu> wrote:
>> On 15.09.2010 14:20, Johan Corveleyn wrote:
>>> Some update on this: I have implemented this for svn_diff (excluding
>>> the identical prefix and suffix of both files, and only then starting
>>> to fill up the token tree and let the LCS algorithm do its thing). It
>>> makes a *huge* difference. On my bigfile.xml (1.5 MB) with only one
>>> line changed, the call to svn_diff_diff is ~10 times faster (15-20 ms
>>> vs. 150-170 ms).
>>
>>
>> Hmmm ... looks to me like test data tailored to the optimization. :)
>
> Nope, that's real data from a real repository, with a normal kind of
> change that happens here every day.
>
> Of course this optimization is most effective if there are a lot of
> common prefix/suffix lines. If there is a single change in the first
> line, and a single change in the last one, this optimization will do
> nothing but introduce a little bit of extra overhead. And it will
> obviously make the most impact on large files (in fact it's just
> relative to the ratio of the "number of common prefix/suffix lines" to
> the "number of lines in between").
>
> I'm sorry it takes me longer than expected to post a version of this
> to the list, but I'm still having some problems with a couple of edge
> conditions (I'm learning C as I go, and I'm struggling with a couple
> of pointer calculations/comparisons). I plan to post something during
> this week...
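
For readers following the thread, the prefix/suffix idea quoted above can be sketched roughly as below. This is only an illustration in Python, not Johan's patch (which is C code inside svn_diff and had not been posted yet). The function names strip_common_affixes and diff_opcodes are made up for this sketch, and difflib.SequenceMatcher is used as a stand-in for the real token tree and LCS code, which it does not reproduce exactly.

```python
from difflib import SequenceMatcher


def strip_common_affixes(a, b):
    """Count the lines shared at the start and at the end of a and b.

    Returns (prefix_len, suffix_len). The suffix scan never overlaps
    the prefix, so the two counts always describe disjoint lines.
    """
    limit = min(len(a), len(b))
    prefix = 0
    while prefix < limit and a[prefix] == b[prefix]:
        prefix += 1
    suffix = 0
    while suffix < limit - prefix and a[-1 - suffix] == b[-1 - suffix]:
        suffix += 1
    return prefix, suffix


def diff_opcodes(a, b):
    """Diff two lists of lines, running the matcher only on the middle.

    Returns the non-"equal" opcodes with indices translated back to
    positions in the full inputs.
    """
    prefix, suffix = strip_common_affixes(a, b)
    mid_a = a[prefix:len(a) - suffix]
    mid_b = b[prefix:len(b) - suffix]
    # Stand-in for the expensive step (token tree + LCS in svn_diff).
    matcher = SequenceMatcher(None, mid_a, mid_b, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            ops.append((tag, i1 + prefix, i2 + prefix,
                        j1 + prefix, j2 + prefix))
    return ops
```

With a 1000-line input in which only line 500 differs, strip_common_affixes reports 500 prefix lines and 499 suffix lines, so the matcher only ever sees one line from each side. When the first and the last line both differ, it reports (0, 0) and the scan is pure overhead, which is the worst case described above.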

Johan,
No need to apologize. Thanks for coming to the retreat at Hursley
this past weekend; the discussion there really helped clarify some of
the concepts around your patches. Keep up the good work!

-Hyrum
Received on 2010-09-21 15:16:26 CEST
