[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Ideas for handling copies/moves in 'svn diff'

From: Stefan Sperling <stsp_at_elego.de>
Date: Sun, 30 May 2010 11:16:55 +0200

On Sat, May 29, 2010 at 02:34:06PM +0200, Daniel Näslund wrote:
> Problem 1
> ==============
> We're processing the delete-half instead of the add-half for a
> copy/move. At present, copy/move add-half is not shown by 'svn diff'
> unless invoked with '--show-copies-as-adds'. But with the 'git unidiff'
> format we would only show a diff for the add-part, e.g. we would change
> the behaviour from reporting the delete-half to reporting the add-half.

Why do we need to change the behaviour?

Right now, svn diff always shows deleted paths as deleted in the diff,
unless you pass --no-diff-deleted. svn patch already knows how to deal
with those files (delete them).

For copies, we don't show the add unless you pass --show-copies-as-adds.
If the copy has local mods, you see a diff between the copy source and
the modified copy. With --show-copies-as-adds, you see every copy as added.

Note that --show-copies-as-adds was added so that svn patch can see
copied files (without knowing that they are copies, it treats them
just like any other added file). Once svn patch understand copy/move
information in the diff header this option will be needed much less.

I think it's fine for now to make copy/move info in the diff header behave
according to the above:

 - Print just a copy header but no content diff for copied without mods:
   Index: somefile
   ===================================================================
   copy from <copyfrom path>
   copy to <somefile>

   Index: nextfile

 - Print a diff as well if the file was modified after copying:
   Index: somefile
   ===================================================================
   copy from <copyfrom path>
   copy to <somefile>
   --- [...]
   +++ [...]
   @@ [...] @@
   [...]

(I've used [...] to denote an omission).

- With --show-copies-as-adds, print copy/move info and also show
  the copy as a diff that only adds lines:
   Index: somefile
   ===================================================================
   copy from <copyfrom path>
   copy to <somefile>
   --- [...]
   +++ [...]
   @@ [...] @@
   + [...]
   + [...]
   + [...]

I think you can handle all the above cases in the file_added() callback
in the diff editor.

> Suggestion on solutions:
>
> 1) Call file_added() for all paths with db_status copied/moved
> instead of using file_changed().
> + Allows us to easily detect what files are adds and the
> svn_wc_diff_callback already has a copyfrom arg.
> - We still have to distinguish between copied and moved paths in
> the callback.

We could chose to only show copies in the git headers for now.
And defer moves till wc-ng can handle them properly.

svn patch will need to make sure to copy files before deleting the source.
This will probably require a two-pass approach when patching...
But again, once wc-ng can tell us whether a file was moved or copied,
this will get much easier.

> 2) Introduce file_copied() and file_moved() callbacks. Those could
> have a 'delete_half' arg or similar, e.g. we would pass both the
> add-half and del-half to that function and separate them with an
> parameter.
> + Might enhance readability.
> - We're introducing a concept that may be viewed as inconsistent
> with the rest of the code. svn handles 'add+del', is it
> acceptable to introduce copy and move at this level of the code?
> Further, the diff_file_added() and diff_file_deleted() does very
> little as is, wouldn't it be better to introduce some extra
> complexity there instead?

Aren't you changing public diff-editor API by adding those functions?
If so, I don't think it's a huge problem to rev the API. But if we
can avoid doing so I'd prefer not to.

> Problem 2
> =============
> We need to process the add-half to know the delete-half. The add-half
> knows about the delete-half through the copyfrom information, but the
> other way around does not work - there's no 'copyto' information. Since
> we don't want to present a diff for the delete-half, we need to be able
> to know that a deleted file actually is part of a delete. Suggestion on
> solutions:
>
> 1) Create a custom svn_wc__db_read_children() func that can has an out
> parameter callled 'copyto', e.g. the func would traverse the entire
> wc looking if there is any other paths that has the current path as
> copyfrom.
> + We can make the detecting operation less expensive than a file
> traversal since we have SQL and the db file will already be in
> memory. (Will involve more reads before we move to one db).
> - It feels a bit awkvard to fetch 'copyto'. There must have been a
> design decision to only only make the copyfrom relation
> unidirected. How does other code parts detect the copyto?

In general, there is no copy-to information in svn at all at the moment.
wc-ng will eventually provide this information. The DB schema supports
copied-from/copied-to and moved-from/moved-to.
But I don't think we record anything but copied-from yet.
So I'd just wait until wc-ng records the extra information we need,
and then generate extra information in the diff.

> 2) Store added and deleted paths as keys in two hashs with the diff
> contents inside diff_content_changed().
> + We don't have to change directory_elements_diff().
> - The memory consumption would grow with the number of deleted and
> added paths. If we choose to save them to tmp-files, we could
> avoid that but would suffer in performance instead.

We store deleted paths in batons in quite a few places. E.g. the merge
code does this to tell apart path replacements from additions.
If a similar scheme works here I'd say let's go for it. But again,
waiting until wc-ng matures enough is probably a better idea.

> 3) ...
>
> Problem 3
> ==========
> If we do changes in the callchain presented earlier, we would affect
> the 'url->wc' use case too. directory_elements_diff() is called from
> close_directory() and close_edit(). We would have git diffs for -r
> BASE:WORKING but no git diff headers for the revisions preceeding BASE.
> I'm assuming that it's not a problem since the current 'svn diff'
> command shows copies for the changes in the wc but not the ones before.

For now, we only care about the wc-wc case. So diffs that involve
revisions preceding BASE are out of scope.

Stefan
Received on 2010-05-30 11:17:43 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.