[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: diff4-optimization-bytes

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Fri, 28 Jan 2011 14:04:07 +0100

On Fri, Jan 28, 2011 at 9:29 AM, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
> Johan,
>
> I'm concerned about this change: on the one hand, it's untested and
> no one claims to be understanding the code; on the other hand, it
> doesn't exactly parallel the diff3 change:
>
> specifically, the last hunk of the diff3 patch (which is also included
> below) has no equivalent in the diff4 patch.  A wild guess is that the
> parameters to svn_diff__diff() might need to be adjusted.
>
> All in all, given your warnings in the log message, I have zero reason
> to believe the change is correct, and I just cited one reason I have to
> believe it's wrong :-).

I admit it's kind of a vulnerable spot in the work I did, because of
the absence of tests. I made those changes simply based on some
generic understanding about the code structure, and with analogy to
the other code.

However, I'm pretty sure that it's correct. That last hunk in the
diff3 patch is not there in diff4, because the analogous piece of code
isn't there in diff4 (and it also isn't in diff2). That's because
diff3 does something quite special: it tries to manually "synchronize"
between the lcs information from "original-modified" and
"original-latest". That isn't done in diff2 nor in diff4. So AFAIU the
changes to diff4 more or less parallel the changes to diff2, and it's
diff3 that is 'special'.

However, I also don't feel quite at easy with this situation, so ...

> May I suggest that, if this code is to be released, then you validate
> its correctnss?  For example, a minimal regression test that is written
> to record trunk's pre-branch behaviour might be sufficient.

... I'll accept your suggestion. I'll try to take some time for that
this weekend. If anyone beats me to it, that would be fine as well of
course :).

Cheers,
Johan

>
> Best,
>
> Daniel
>
>
> jcorvel_at_apache.org wrote on Thu, Jan 13, 2011 at 22:02:38 -0000:
>> Author: jcorvel
>> Date: Thu Jan 13 22:02:38 2011
>> New Revision: 1058759
>>
>> URL: http://svn.apache.org/viewvc?rev=1058759&view=rev
>> Log:
>> On the diff-optimizations-bytes branch:
>>
>> Adapt the diff4 algorithm, as far as I understand it, to make use of the
>> prefix/suffix optimization.
>>
>> Note:
>> - diff4 isn't used in svn core, only in tools/diff/diff4.
>> - This change is untested, because there is currently no test in the test
>>   suite exercising this algorithm, and I don't understand it well enough to
>>   write tests for it. Review and/or adding tests for this code is thus very
>>   welcome.
>>
>> * subversion/libsvn_diff/diff4.c
>>   (svn_diff_diff4): Open the 4 datasources together with datasources_open, to
>>   let it eliminate identical prefix and suffix.
>>
>> Modified:
>>     subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff4.c
>>
>> Modified: subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff4.c
>> URL: http://svn.apache.org/viewvc/subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff4.c?rev=1058759&r1=1058758&r2=1058759&view=diff
>> ==============================================================================
>> --- subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff4.c (original)
>> +++ subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff4.c Thu Jan 13 22:02:38 2011
>> @@ -173,6 +173,10 @@ svn_diff_diff4(svn_diff_t **diff,
>>  {
>>    svn_diff__tree_t *tree;
>>    svn_diff__position_t *position_list[4];
>> +  svn_diff_datasource_e datasource[] = {svn_diff_datasource_original,
>> +                                        svn_diff_datasource_modified,
>> +                                        svn_diff_datasource_latest,
>> +                                        svn_diff_datasource_ancestor};
>>    svn_diff__lcs_t *lcs_ol;
>>    svn_diff__lcs_t *lcs_adjust;
>>    svn_diff_t *diff_ol;
>> @@ -181,6 +185,7 @@ svn_diff_diff4(svn_diff_t **diff,
>>    apr_pool_t *subpool;
>>    apr_pool_t *subpool2;
>>    apr_pool_t *subpool3;
>> +  apr_off_t prefix_lines = 0;
>>
>>    *diff = NULL;
>>
>> @@ -190,36 +195,38 @@ svn_diff_diff4(svn_diff_t **diff,
>>
>>    svn_diff__tree_create(&tree, subpool3);
>>
>> +  SVN_ERR(vtable->datasources_open(diff_baton, &prefix_lines, datasource, 4));
>> +
>>    SVN_ERR(svn_diff__get_tokens(&position_list[0],
>>                                 tree,
>>                                 diff_baton, vtable,
>>                                 svn_diff_datasource_original,
>> -                               FALSE,
>> -                               0,
>> +                               TRUE,
>> +                               prefix_lines,
>>                                 subpool2));
>>
>>    SVN_ERR(svn_diff__get_tokens(&position_list[1],
>>                                 tree,
>>                                 diff_baton, vtable,
>>                                 svn_diff_datasource_modified,
>> -                               FALSE,
>> -                               0,
>> +                               TRUE,
>> +                               prefix_lines,
>>                                 subpool));
>>
>>    SVN_ERR(svn_diff__get_tokens(&position_list[2],
>>                                 tree,
>>                                 diff_baton, vtable,
>>                                 svn_diff_datasource_latest,
>> -                               FALSE,
>> -                               0,
>> +                               TRUE,
>> +                               prefix_lines,
>>                                 subpool));
>>
>>    SVN_ERR(svn_diff__get_tokens(&position_list[3],
>>                                 tree,
>>                                 diff_baton, vtable,
>>                                 svn_diff_datasource_ancestor,
>> -                               FALSE,
>> -                               0,
>> +                               TRUE,
>> +                               prefix_lines,
>>                                 subpool2));
>>
>>    /* Get rid of the tokens, we don't need them to calc the diff */
>> @@ -230,7 +237,8 @@ svn_diff_diff4(svn_diff_t **diff,
>>    svn_pool_clear(subpool3);
>>
>>    /* Get the lcs for original - latest */
>> -  lcs_ol = svn_diff__lcs(position_list[0], position_list[2], 0, subpool3);
>> +  lcs_ol = svn_diff__lcs(position_list[0], position_list[2], prefix_lines,
>> +                         subpool3);
>>    diff_ol = svn_diff__diff(lcs_ol, 1, 1, TRUE, pool);
>>
>>    svn_pool_clear(subpool3);
>> @@ -251,7 +259,8 @@ svn_diff_diff4(svn_diff_t **diff,
>>    /* Get the lcs for common ancestor - original
>>     * Do reverse adjustements
>>     */
>> -  lcs_adjust = svn_diff__lcs(position_list[3], position_list[2], 0, subpool3);
>> +  lcs_adjust = svn_diff__lcs(position_list[3], position_list[2], prefix_lines,
>> +                             subpool3);
>>    diff_adjust = svn_diff__diff(lcs_adjust, 1, 1, FALSE, subpool3);
>>    adjust_diff(diff_ol, diff_adjust);
>>
>> @@ -260,7 +269,8 @@ svn_diff_diff4(svn_diff_t **diff,
>>    /* Get the lcs for modified - common ancestor
>>     * Do forward adjustments
>>     */
>> -  lcs_adjust = svn_diff__lcs(position_list[1], position_list[3], 0, subpool3);
>> +  lcs_adjust = svn_diff__lcs(position_list[1], position_list[3], prefix_lines,
>> +                             subpool3);
>>    diff_adjust = svn_diff__diff(lcs_adjust, 1, 1, FALSE, subpool3);
>>    adjust_diff(diff_ol, diff_adjust);
>>
>>
>>
>
> jcorvel_at_apache.org wrote on Thu, Jan 13, 2011 at 21:38:16 -0000:
>> Author: jcorvel
>> Date: Thu Jan 13 21:38:16 2011
>> New Revision: 1058753
>>
>> URL: http://svn.apache.org/viewvc?rev=1058753&view=rev
>> Log:
>> On the diff-optimizations-bytes branch:
>>
>> Adapt the diff3 algorithm to make use of the prefix/suffix optimization.
>>
>> * subversion/libsvn_diff/diff3.c
>>   (svn_diff_diff3): Open the 3 datasources together with datasources_open, to
>>    let it eliminate identical prefix and suffix. Don't forget to adjust the
>>    sentinel_positions with the prefix_lines, in the case of an empty
>>    position_list.
>>
>> Modified:
>>     subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff3.c
>>
>> Modified: subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff3.c
>> URL: http://svn.apache.org/viewvc/subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff3.c?rev=1058753&r1=1058752&r2=1058753&view=diff
>> ==============================================================================
>> --- subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff3.c (original)
>> +++ subversion/branches/diff-optimizations-bytes/subversion/libsvn_diff/diff3.c Thu Jan 13 21:38:16 2011
>> @@ -251,10 +251,14 @@ svn_diff_diff3(svn_diff_t **diff,
>>  {
>>    svn_diff__tree_t *tree;
>>    svn_diff__position_t *position_list[3];
>> +  svn_diff_datasource_e datasource[] = {svn_diff_datasource_original,
>> +                                        svn_diff_datasource_modified,
>> +                                        svn_diff_datasource_latest};
>>    svn_diff__lcs_t *lcs_om;
>>    svn_diff__lcs_t *lcs_ol;
>>    apr_pool_t *subpool;
>>    apr_pool_t *treepool;
>> +  apr_off_t prefix_lines = 0;
>>
>>    *diff = NULL;
>>
>> @@ -263,28 +267,30 @@ svn_diff_diff3(svn_diff_t **diff,
>>
>>    svn_diff__tree_create(&tree, treepool);
>>
>> +  SVN_ERR(vtable->datasources_open(diff_baton, &prefix_lines, datasource, 3));
>> +
>>    SVN_ERR(svn_diff__get_tokens(&position_list[0],
>>                                 tree,
>>                                 diff_baton, vtable,
>>                                 svn_diff_datasource_original,
>> -                               FALSE,
>> -                               0,
>> +                               TRUE,
>> +                               prefix_lines,
>>                                 subpool));
>>
>>    SVN_ERR(svn_diff__get_tokens(&position_list[1],
>>                                 tree,
>>                                 diff_baton, vtable,
>>                                 svn_diff_datasource_modified,
>> -                               FALSE,
>> -                               0,
>> +                               TRUE,
>> +                               prefix_lines,
>>                                 subpool));
>>
>>    SVN_ERR(svn_diff__get_tokens(&position_list[2],
>>                                 tree,
>>                                 diff_baton, vtable,
>>                                 svn_diff_datasource_latest,
>> -                               FALSE,
>> -                               0,
>> +                               TRUE,
>> +                               prefix_lines,
>>                                 subpool));
>>
>>    /* Get rid of the tokens, we don't need them to calc the diff */
>> @@ -295,9 +301,9 @@ svn_diff_diff3(svn_diff_t **diff,
>>    svn_pool_destroy(treepool);
>>
>>    /* Get the lcs for original-modified and original-latest */
>> -  lcs_om = svn_diff__lcs(position_list[0], position_list[1], 0,
>> +  lcs_om = svn_diff__lcs(position_list[0], position_list[1], prefix_lines,
>>                           subpool);
>> -  lcs_ol = svn_diff__lcs(position_list[0], position_list[2], 0,
>> +  lcs_ol = svn_diff__lcs(position_list[0], position_list[2], prefix_lines,
>>                           subpool);
>>
>>    /* Produce a merged diff */
>> @@ -330,7 +336,7 @@ svn_diff_diff3(svn_diff_t **diff,
>>        }
>>      else
>>        {
>> -        sentinel_position[0].offset = 1;
>> +        sentinel_position[0].offset = prefix_lines + 1;
>>          sentinel_position[0].next = NULL;
>>          position_list[1] = &sentinel_position[0];
>>        }
>> @@ -344,7 +350,7 @@ svn_diff_diff3(svn_diff_t **diff,
>>        }
>>      else
>>        {
>> -        sentinel_position[1].offset = 1;
>> +        sentinel_position[1].offset = prefix_lines + 1;
>>          sentinel_position[1].next = NULL;
>>          position_list[2] = &sentinel_position[1];
>>        }
>>
>>
>
>
Received on 2011-01-28 14:05:09 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.