[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] fix for diff optimization bug

From: Daniel Shahaf <danielsh_at_elego.de>
Date: Tue, 25 Dec 2012 18:25:23 +0200

Hideki IWAMOTO wrote on Wed, Dec 26, 2012 at 00:31:16 +0900:
> Hi, Daniel.
>
> > You might be interested to know that on the 'trunk', a new test has already been added for a similar problem:
> >
> > $ ./subversion/tests/libsvn_diff/diff-diff3-test --list
> > [...]
> > ? 13??? XFAIL? difference at the start of a 128KB window
>
> I have some questions about the test which you added for issue #4133.
>
> 1. Enum constant corresponding to the option `-w' is svn_diff_file_ignore_space_all.
> But, you have set svn_diff_file_ignore_space_change to ignore_space.
> Why?
>

That's probably a bug in the test: it is failing now because it expects
"ba \n" and "ba\n" to be considered equal, but it should be using
svn_diff_file_ignore_space_all for them to be considered equal.

I don't care whether the test uses svn_diff_file_ignore_space_change or
svn_diff_file_ignore_space_all, so long as it tests for issue #4133.

> subversion/tests/libsvn_diff/diff-diff3-test.c:
> 2397: /* Issue #4133, '"diff -x -w" showing wrong change'.
> ...
> 2408: diff_opts->ignore_space = svn_diff_file_ignore_space_change;
>
>
> 2. The 128KB chunk size is used only for diff on file, and it seems not to affect on
> diff on memory.
> Why do you use svn_diff_mem_string_diff?
>

Because I assumed using svn_diff_mem_* would reproduce issue #4133. If
it doesn't, the test should be amended to use an API that will reproduce
the issue.

> subversion/tests/libsvn_diff/diff-diff3-test.c:
> 2398: The magic number used in this test, 1<<17, is
> 2399: CHUNK_SIZE from ../../libsvn_diff/diff_file.c
> ...
> 2425: SVN_ERR(svn_diff_mem_string_diff(&diff, &left, &right, diff_opts, pool));
>

Thanks for catching these problems. Would you be able to write patches
to fix them? It sounds like you're already more familiar with the diff
code than I am. :)

Cheers

Daniel

>
>
> On Mon, 24 Dec 2012 20:51:09 +0000 (GMT)
> Julian Foad <julianfoad_at_btopenworld.com> wrote:
>
> > Hi, Hideki.? Thank you very much for finding this bug and a fix for it.
> > You might be interested to know that on the 'trunk', a new test has already been added for a similar problem:
> >
> > $ ./subversion/tests/libsvn_diff/diff-diff3-test --list
> > [...]
> > ? 13??? XFAIL? difference at the start of a 128KB window
> >
> > We don't have a fix for this bug yet.? I don't know whether this bug also exists in version 1.7.8.
> >
> >
> > It would be very useful to have a regression test for the bug that you have found.? Would you be able to convert your reproduction recipe into a regression test written in C like the that one on trunk?
> > Please let us know if you would be willing to write a test for the bug you found, and/or port test 13 to version 1.7, and/or write a patch to fix the bug shown by test number 13.? We can treat them as two entirely separate problems, but maybe you have the skill and wish to help fix both of them.
> >
> > - Julian
> >
> >
> > Hideki IWAMOTO wrote:
> >
> > > The optimization of diff inclued in version 1.7 has a bug that
> > > produces incorrect diff on a certain condition.
> > > The attached patch fix it.
> > >
> > >
> > > Detail of the bug
> > > -----------------
> > >
> > > When the identical suffix begins at the boundary of a chunk,
> > > datasource_get_next_token() defined in subversion/libsvn_diff/diff_file.c
> > > does not stop at head of the identical suffix.
> > > Therefore, when one of the identical suffixes of the original file
> > > and the modified file begin from the boundary of a chunk,
> > > excessive tokens are added to the diff tree.
> > >
> > > How to reproduce
> > > ----------------
> > >
> > > $ for ((i=0;i<8256;i++)); do echo 0123456789abcde; done > test.txt
> > > $ hexdump -C test.txt
> > > 00000000? 30 31 32 33 34 35 36 37? 38 39 61 62 63 64 65 0a? |0123456789abcde.|
> > > *
> > > 00020400
> > > $ svn add test.txt; svn ci -m test
> > > A? ? ? ? test.txt
> > > Adding? ? ? ? test.txt
> > > Transmitting file data .
> > > Committed revision 2.
> > > $ echo 0123456789ABCDE |dd of=test.txt bs=16 seek=64 conv=notrunc
> > > 1+0 records in
> > > 1+0 records out
> > > $ echo 0123456789ABCDE |dd of=test.txt bs=16 seek=8141 conv=notrunc
> > > 1+0 records in
> > > 1+0 records out
> > > $ echo 0123456789abcde >> test.txt
> > > $ echo 0123456789abcde >> test.txt
> > > $ hexdump -C test.txt
> > > 00000000? 30 31 32 33 34 35 36 37? 38 39 61 62 63 64 65 0a? |0123456789abcde.|
> > > *
> > > 00000400? 30 31 32 33 34 35 36 37? 38 39 41 42 43 44 45 0a? |0123456789ABCDE.|
> > > 00000410? 30 31 32 33 34 35 36 37? 38 39 61 62 63 64 65 0a? |0123456789abcde.|
> > > *
> > > 0001fcd0? 30 31 32 33 34 35 36 37? 38 39 41 42 43 44 45 0a? |0123456789ABCDE.|
> > > 0001fce0? 30 31 32 33 34 35 36 37? 38 39 61 62 63 64 65 0a? |0123456789abcde.|
> > > *
> > > 00020420
> > > $ svn cat test.txt | diff -u - test.txt
> > > --- -? 2012-12-24 22:30:18.760832000 +0900
> > > +++ test.txt? ? 2012-12-24 22:29:24.000000000 +0900
> > > @@ -62,6 +62,7 @@
> > > 0123456789abcde
> > > 0123456789abcde
> > > 0123456789abcde
> > > +0123456789ABCDE
> > > 0123456789abcde
> > > 0123456789abcde
> > > 0123456789abcde
> > > @@ -8138,6 +8139,7 @@
> > > 0123456789abcde
> > > 0123456789abcde
> > > 0123456789abcde
> > > +0123456789ABCDE
> > > 0123456789abcde
> > > 0123456789abcde
> > > 0123456789abcde
> > > $ svn di test.txt
> > > Index: test.txt
> > > ===================================================================
> > > --- test.txt? ? (revision 2)
> > > +++ test.txt? ? (working copy)
> > > @@ -62,6 +62,7 @@
> > > 0123456789abcde
> > > 0123456789abcde
> > > 0123456789abcde
> > > +0123456789ABCDE
> > > 0123456789abcde
> > > 0123456789abcde
> > > 0123456789abcde
> > > @@ -8138,6 +8139,7 @@
> > > 0123456789abcde
> > > 0123456789abcde
> > > 0123456789abcde
> > > +0123456789ABCDE
> > > 0123456789abcde
> > > 0123456789abcde
> > > 0123456789abcde
> > > @@ -8188,6 +8190,72 @@
> > > 0123456789abcde
> > > 0123456789abcde
> > > 0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > +0123456789abcde
> > > 0123456789abcde
> > > 0123456789abcde
> > > 0123456789abcde
> > >
> > >
> > > --
> > > Hideki IWAMOTO <h-iwamoto_at_kit.hi-ho.ne.jp>
> > >
>
> --
> Hideki IWAMOTO <h-iwamoto_at_kit.hi-ho.ne.jp>
>
Received on 2012-12-25 17:26:17 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.