[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn_diff.h truncated

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Mon, 14 Feb 2011 22:22:42 +0100

On Wed, Feb 9, 2011 at 12:29 AM, Johan Corveleyn <jcorvel_at_gmail.com> wrote:
> On Tue, Feb 8, 2011 at 11:19 PM, Johan Corveleyn <jcorvel_at_gmail.com> wrote:
>> On Tue, Feb 8, 2011 at 10:50 PM, Stefan Sperling <stsp_at_elego.de> wrote:
>>> On Tue, Feb 08, 2011 at 10:45:03PM +0100, Johan Corveleyn wrote:
>>>> On Tue, Feb 8, 2011 at 9:01 PM, Johan Corveleyn <jcorvel_at_gmail.com> wrote:
>>>> > On Tue, Feb 8, 2011 at 1:28 PM, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
>>>> >> I ran into this while trying to 'svn up' today, and managed to reproduce
>>>> >> it in another working copy:
>>>> >>
>>>> >> % svn revert -R .
>>>> >> % cat ./before
>>>> >> Index: subversion/include/svn_diff.h
>>>> >> ===================================================================
>>>> >> --- subversion/include/svn_diff.h       (revision 1067829)
>>>> >> +++ subversion/include/svn_diff.h       (working copy)
>>>> >> @@ -198,7 +198,7 @@
>>>> >>  svn_error_t *
>>>> >>  svn_diff_diff_2(svn_diff_t **diff,
>>>> >>                 void *diff_baton,
>>>> >> -                const svn_diff_fns2_t *diff_fns,
>>>> >> +                const svn_diff_fns2_t *diff_fns2,
>>>> >>                 apr_pool_t *pool);
>>>> >>
>>>> >>  /** Given a vtable of @a diff_fns/@a diff_baton for reading datasources,
>>>> >> @@ -224,7 +224,7 @@
>>>> >>  svn_error_t *
>>>> >>  svn_diff_diff3_2(svn_diff_t **diff,
>>>> >>                  void *diff_baton,
>>>> >> -                 const svn_diff_fns2_t *diff_fns,
>>>> >> +                 const svn_diff_fns2_t *diff_fns2,
>>>> >>                  apr_pool_t *pool);
>>>> >>
>>>> >>  /** Given a vtable of @a diff_fns/@a diff_baton for reading datasources,
>>>> >> @@ -252,7 +252,7 @@
>>>> >>  svn_error_t *
>>>> >>  svn_diff_diff4_2(svn_diff_t **diff,
>>>> >>                  void *diff_baton,
>>>> >> -                 const svn_diff_fns2_t *diff_fns,
>>>> >> +                 const svn_diff_fns2_t *diff_fns2,
>>>> >>                  apr_pool_t *pool);
>>>> >>
>>>> >>  /** Given a vtable of @a diff_fns/@a diff_baton for reading datasources,
>>>> >> % patch -p0 < ./before
>>>> >> % svn up -q subversion/include/
>>>> >> % cat -n subversion/include/svn_diff.h | tail
>>>> >>   296   * Differences, similarities, and conflicts are described by lining up
>>>> >>   297   * "ranges" of data.
>>>> >>   298   *
>>>> >>   299   * @note These callbacks describe data ranges in units of "tokens".
>>>> >>   300   * A "token" is whatever you've defined it to be in your datasource
>>>> >>   301   * @c svn_diff_fns_t vtable.
>>>> >>   302   */
>>>> >>   303  typedef struct svn_diff_output_fns_t
>>>> >>   304  {
>>>> >>   305    /* Two-way and three-way diffs both call the first two output functions: */
>>>> >> %
>>>> >
>>>> > Wow, weird. I can reproduce it too. I'm looking into it.
>>>>
>>>> I'm continuing to try and fix this. For now, some thoughts I typed on IRC:
>>>>
>>>> [22:29] <@jcorvel> ok, I'm starting to understand the truncating business
>>>> [22:30] <@jcorvel> diff is not affected, because it only outputs the
>>>> modified stuff from the diff chain
>>>> [22:30] <@jcorvel> (with their context)
>>>> [22:30] <@jcorvel> but diff3 outputs the entire chain ...
>>>> [22:31] <@ehu> which is missing its last record?
>>>> [22:31] <@jcorvel> yes
>>>> [22:31] <@ehu> evil
>>>> [22:32] <@jcorvel> I don't add the suffix to that diff chain (except
>>>> the first 50 lines, because I don't consider them part of the
>>>> identical suffix, to help diff/blame find a good way to output it)
>>>> [22:33] <@jcorvel> that was really an early optimization (when I was
>>>> only focusing on diff2): I saw that it didn't need the identical
>>>> suffix to do its work correctly
>>>> [22:33] <@jcorvel> that way, I could avoid counting lines while
>>>> scanning the identical suffix (which I do need to do for prefix
>>>> scanning, but I thought I could get away with it for the suffix)
>>>> [22:34] <@jcorvel> only later I included diff3 and diff4, and
>>>> considered everything ok when the test-suite passed
>>>> [22:34] <@jcorvel> if there would have been a merge test with more
>>>> than 50 common lines at the end, that test would have failed ...
>>>> [22:36] <@jcorvel> so, there are a couple of ways to fix this:
>>>> [22:36] <@jcorvel> 0. remove suffix scanning
>>>> [22:36] <@jcorvel> 1. make suffix scanning count lines, so it can
>>>> include a "common" diff chunk at the end of the chain, with the
>>>> correct nr of lines
>>>> [22:37] <@jcorvel> 2. have some way for a diff chunk in that chain to
>>>> indicate "until the end", and make svn_diff_file_output_merge2 cope
>>>> with that
>>>> [22:37] <@jcorvel> that's all I can think of right now
>>>>
>>>> Any opinions on which solution I should pursue? For now, I'm looking
>>>> at how difficult 2. would be. But if it takes too long, maybe I should
>>>> simple do 0. to eliminate potential wc corruption for people working
>>>> with trunk svn ...
>>>
>>> I'd suggest commit something for approach 0 now (but don't remove lots
>>> of code, a simple #ifdef or some other way of avoiding to call the code
>>> is fine).  And then look into 1 or 2 :)
>>
>> Ok, done: r1068613
>>
>> That takes the pressure off a bit. Though I really wouldn't want to
>> lose that half of the optimization :-(, so I'll continue working on a
>> better fix ...
>
> I think I'd better go for option 1: counting lines while
> suffix-scanning, so the entire diff chain is correct. Seems like the
> safe thing to do. Special casing while outputting the diff/merge seems
> like a bad idea.

Just to wrap this up: suffix scanning was re-enabled in r1069961, with
option 1 as described above (counting suffix lines, making sure the
LCS is completely correct).

I hope people's working copies are safe again ...

Cheers,

-- 
Johan
Received on 2011-02-14 22:23:44 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.