I am looking into issue #2543, which is about making 'svn diff'
show the full content of files after the file was renamed.
I though I'd share my current progress and ideas to get my approach
checked for sanity and gather any feedback you might have.
Below, I'll describe what svn diff did before my changes, what I would
like it to do instead, what I have done so far, and what problems I
am now facing.
== What svn originally did ==
When renaming a file locally in the WC, svn diff would show the
file deleted at the old path in its entirety, but not show it
at the new path:
$ svn move alpha alpha2
A alpha2
D alpha
$ svn diff
Index: alpha
===================================================================
--- alpha (revision 2)
+++ alpha (working copy)
@@ -1 +0,0 @@
-alpha
$
Only if the renamed file was also modified, it would appear in the diff
output, showing how it was modified relative to the move source:
$ echo aaa >> alpha2
$ svn diff
Index: alpha
===================================================================
--- alpha (revision 2)
+++ alpha (working copy)
@@ -1 +0,0 @@
-alpha
Index: alpha2
===================================================================
--- alpha2 (revision 2)
+++ alpha2 (working copy)
@@ -1 +1,2 @@
alpha
+aaa
$
The behaviour when diffing entire directories was similar, i.e. only
files which were modified in addition to being moved (or copied) with
their parent directory were shown in the diff output.
== What I'd like svn diff to do instead ==
There is a good reason for the above behaviour. It allows users to see
which local modifications they have made to a file after moving or
copying it. But this is not always what the user wants to know.
For example, when creating a patch which is supposed to be applied back
with UNIX patch (or svn patch, more on that later), it is impossible to
tell svn diff to include moved or copied files in their entirety. This
means that 'svn diff' is absolutely useless for creating patches which
add copied or renamed files or directories.
It should be possible to get either behaviour from 'svn diff'.
One of these behaviours will be the default, the other will be toggled
by a command line flag.
Note that files which were newly added without history are already shown
in their entirety, so there's no problem with added files or directories.
== What I've done so far ==
There is a flag which has a suitable name, --notice-ancestry.
It is documented in the book as follows:
By default, svn diff ignores the ancestry of files and merely compares the
contents of the two files being compared. If you use --notice-ancestry,
the ancestry of the paths in question will be taken into consideration when
comparing revisions (i.e., if you run svn diff on two files with identical
contents but different ancestry, you will see the entire contents of the
file as having been removed and added again).
I decided to generalise the meaning of the --notice-ancestry flag (this may
have been a wrong decision), and modified svn diff on trunk so that it now
displays copied files just like newly added files by default:
$ svn diff
Index: alpha
===================================================================
--- alpha (revision 2)
+++ alpha (working copy)
@@ -1 +0,0 @@
-alpha
Index: alpha2
===================================================================
--- alpha2 (revision 0)
+++ alpha2 (revision 2)
@@ -0,0 +1,2 @@
+alpha
+aaa
And the old behaviour is available with --notice-ancestry:
$ svn diff --notice-ancestry
Index: alpha
===================================================================
--- alpha (revision 2)
+++ alpha (working copy)
@@ -1 +0,0 @@
-alpha
Index: alpha2
===================================================================
--- alpha2 (revision 2)
+++ alpha2 (working copy)
@@ -1 +1,2 @@
alpha
+aaa
This was very easy to implement, since all it took was to add a
check for the --notice-ancestry flag at the point where we tweak
a copied file's schedule from 'add' to 'normal' to force comparison
to its text base, which we end up finding via copyfrom information.
See r39680.
I also got this to work when comparing the working copy to the repository:
$ svn diff -r2 alpha2
Index: alpha2
===================================================================
--- alpha2 (revision 0)
+++ alpha2 (revision 2)
@@ -0,0 +1,2 @@
+alpha
+aaa
$ svn diff --notice-ancestry -r1 alpha2
Index: alpha2
===================================================================
--- alpha2 (.../alpha) (revision 1)
+++ alpha2 (.../alpha2) (working copy)
@@ -1 +1,2 @@
alpha
+aaa
The change for this repos->wc diff case was a little more involved.
What I did was to make resolving of the pegged URL diff target via
svn_client__repos_locations() conditional on the --notice-ancestry option,
and passing entry->URL of the copied file unmodified if the flag is not set.
Since the file was locally copied and does (usually ;) not yet exist in the
repository, the resulting diff only shows added lines. See r39720.
Diffing with future revisions works as expected.
In the case where the file does exist in the left-diff revision,
the user is shown a diff between the locally copied file and the
file that exists in the repository.
For example, in a WC based on r2, we copied 'alpha' to 'foo',
and someone else already added a new file 'foo' added in r3:
$ svn diff foo
Index: foo
===================================================================
--- foo (revision 0)
+++ foo (revision 2)
@@ -0,0 +1 @@
+alpha
$ svn diff -r2 foo # foo does not exist in r2
Index: foo
===================================================================
--- foo (revision 0)
+++ foo (revision 2)
@@ -0,0 +1 @@
+alpha
$ svn diff -r3 foo # foo does exist in r3, however
Index: foo
===================================================================
--- foo (revision 3)
+++ foo (working copy)
@@ -1 +1 @@
-foo
+alpha
What does not work yet is 'svn diff -rX URL'.
While a WC path triggers the desired behaviour...
$ svn diff -r1 alpha2
Index: alpha2
===================================================================
--- alpha2 (revision 0)
+++ alpha2 (revision 3)
@@ -0,0 +1,2 @@
+alpha
+aaa
... a URL does not:
$ svn diff -r1 ^/trunk/alpha2
Index: alpha
===================================================================
--- alpha (.../alpha) (revision 1)
+++ alpha (.../alpha2) (revision 3)
@@ -1 +1,2 @@
alpha
+aaa
Also, so far my changes work for files, but not for directories.
I see no reason why this should not work for directories,
but I have not yet looked into fixing that.
== Problems ==
Assuming it can be made to work with directories as well as files,
the current solution makes the impossible use case possible.
But I can see a couple of problems with it already, and there may be
even more problems which I am not seeing yet.
1) It is possible that diff targets sitting in the repository cannot be
found with the new behaviour.
So if we enable it by default, we enable such problems by default.
E.g. diff_tests.py 33 has this problem, it errors out during a simple
'svn diff -r1' unless --notice-ancestry is passed:
svn: Try using the --notice-ancestry option
svn: Target path '/A_prime' does not exist
It's easy to reproduce the problem:
$ svn mkdir dir
A dir
$ svn ci -m"add dir"
Adding dir
Committed revision 3.
$ cd dir
$ svn diff -r2
svn: Try using the --notice-ancestry option
svn: Target path '/trunk/dir' does not exist
This gets worse, e.g. svn does not complain if we do:
$ cd ..
$ svn diff -r2 dir
$
Whoops...
Also, it looks like there is still a bug related to peg revision parsing:
$ cd dir
$ svn diff -r2 ._at_3
svn: '@3' is just a peg revision. Maybe try '@3@' instead?
2) When we add support for git extensions, the default will have to
change again, since copies are represented as:
diff --git a/{oldname} b/{newname}
copy from {oldname}
copy to {newname}
{normal unidiff data here if there was a difference, only for how new
is different from old}
Which means that as soon as we support git extensions, any intelligent
patching tool (and most importantly svn patch) can do the right thing
with just the delta between copy or move source and target. Changing the
default now just to change it again later is certainly not ideal.
Still, we will need new behaviour anyway to facilitate creation of
patches which less intelligent patching tools (such as UNIX patch)
can deal with. So I want to keep it, just not make it the default.
This implies adding a new command line flag to 'svn diff', because
--notice-ancestry cannot be used for this purpose. Any ideas for a
suitable name? --full-copies? --add-copies? --ignore-copyfrom?
--treat-copies-as-added-without-history?
3) I'd like this new 'svn diff' command line flag to behave consistently
across all possible use cases for 'svn diff'.
When doing pure repos->repos diffs, we currently always expect the
files being diffed to exist at the specified peg revisions (defaulting
to OLDREV and NEWREV with the second synopsis). If they don't exist
we currently error out. This would need to change.
If we allow either of the targets to be non-existent, it is possible
to diff any item in the repository against non-existent paths in the
repository, producing a diff that fully adds (or deletes) the non-missing
diff target. I have a local patch (see below) which results in the
following behaviour:
$ svn diff --old=^/does_not_exist --new=^/trunk/beta
Index: does_not_exist
===================================================================
--- does_not_exist (.../does_not_exist) (revision 0)
+++ does_not_exist (.../trunk/beta) (revision 3)
@@ -0,0 +1 @@
+beta
$ svn diff --old=^/trunk/beta --new=^/does_not_exist
Index: beta
===================================================================
--- beta (.../trunk/beta) (revision 3)
+++ beta (.../does_not_exist) (revision 3)
@@ -1 +0,0 @@
-beta
$ svn diff --notice-ancestry --old=^/trunk/beta \
--new=^/trunk/does_not_exist
svn: 'file:///tmp/svn-sandbox/repos/does_not_exist' was not found \
in the repository at revision 3
(Note how the diff uses the *old* name in the +++ diff header line,
meaning UNIX patch will create or delete a file with the wrong name.
This also happens with valid URLs and is probably a bug we need to fix.)
Since a simple typo can potentially lead to huge diff output with the
new behaviour, this is another reason why it should not be the default.
Am I on the right track?
Thanks,
Stefan
Index: subversion/libsvn_client/diff.c
===================================================================
--- subversion/libsvn_client/diff.c (revision 39749)
+++ subversion/libsvn_client/diff.c (working copy)
@@ -1094,7 +1094,7 @@ diff_prepare_repos_repos(const struct diff_paramet
(params->path2 == drr->url2) ? NULL : params_path2_abspath,
ra_session, params->revision2, pool));
SVN_ERR(svn_ra_check_path(ra_session, "", drr->rev2, &kind2, pool));
- if (kind2 == svn_node_none)
+ if (! params->ignore_ancestry && kind2 == svn_node_none)
return svn_error_createf
(SVN_ERR_FS_NOT_FOUND, NULL,
_("'%s' was not found in the repository at revision %ld"),
@@ -1106,7 +1106,7 @@ diff_prepare_repos_repos(const struct diff_paramet
(params->path1 == drr->url1) ? NULL : params_path1_abspath,
ra_session, params->revision1, pool));
SVN_ERR(svn_ra_check_path(ra_session, "", drr->rev1, &kind1, pool));
- if (kind1 == svn_node_none)
+ if (! params->ignore_ancestry && kind1 == svn_node_none)
return svn_error_createf
(SVN_ERR_FS_NOT_FOUND, NULL,
_("'%s' was not found in the repository at revision %ld"),
------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2402958
Received on 2009-10-02 16:56:29 CEST