
Re: Moves in FSFS

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Wed, 11 Sep 2013 19:09:03 +0100 (BST)

Thanks, Stefan! Very interesting thoughts. Especially... (scroll down)...

Stefan Fuhrmann wrote:
On Wed, Sep 11, 2013 at 5:21 PM, Julian Foad wrote:
>> http://wiki.apache.org/subversion/MoveDev/MoveDev#Move_Semantics
>> http://wiki.apache.org/subversion/MoveDev/MovesInFSFS
>> [...]
>> One issue that may be harder than it sounds at first is the concept of
>> 'node-line-id' rather than (node-id, copy-id) as the basis of the
>> definition. The point is that when we copy (ordinary copy, not move)
>> a directory, we lazy-copy the children, which means each child keeps
>> its old (node-id, copy-id) unless and until it is modified. That's
>> great for achieving the O(1) copy, but for move-tracking purposes each
>> child needs a unique "node-line-id" so its life-line can be uniquely
>> traced forward and back between this revision and a later revision by
>> which time it may have been modified and thus assigned a new copy-id.
>> Clearly it would defeat the O(1) cost if we were to construct a
>> node-line-id explicitly for every node in the tree at copy time. Can
>> we instead define node-line-id such that we can compute it as needed,
>> from either an unmodified lazy-copied child or after such a child has
>> been modified, and get the same answer? Or perhaps re-state the
>> problem to avoid this need?
> I'm currently bogged down in svnlive prep work but here is my quick feedback;
> more to come next week.
> Bottom line: looks ok, especially the API seems fine and performance in f7
> should be acceptable even if it is O(changes in [rA .. rB]).
> General observations:
>* We need a format bump for the extra "M" entries in the changed path
> lists, potential "lazy" markers in the tree etc. But that is not a problem
> as the log-addressing branch probably gets merged in about 2 weeks'
> time and bumps to f7 anyway. It also brings the infrastructure for
> "mixed addressing" such that we may introduce extended structures
> in existing repositories without touching existing revisions.
>* Existing copy&del pairs will not be treated as move since the node-line-id
> does not match. Maybe, we can add some intelligence to 'svnadmin load'.
>* A copy effectively destroys all move relationships below it. That seems
> unfortunate (say, you duplicate a project) but the solution to that would
> probably require hierarchical IDs ("match IDs within the context of this
> sub-tree").

That's a good observation. Here's an example, to clarify:

r10: trunk/foo

move foo to bar

r20: trunk/bar

copy trunk to branch1

r30: trunk/bar, branch1/bar

Now request "svn diff -r10:30 branch1". It would be useful if Subversion could say trunk/foo@10 moved to branch1/bar@30 in the context of this diff. (Where I say "diff" we can also substitute "update", "merge", and so on.)

This only makes sense for a copy at or above the root path of the requested diff. In this example, it makes sense for "diff -r10:30 branch1" and for "diff -r10:30 branch1/bar". It does not make sense across a copy that happened below the target: in this case, "diff -r10:30 ^/" would NOT be expected to show foo@10->bar@30 as a move.
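
To make that rule concrete, here is a minimal Python sketch. It is not Subversion code: the Change record, the HISTORY list and the origin_of() function are hypothetical names invented for illustration, and the model covers only whole-path moves and copies. It walks history backwards and follows a copy only when the copy destination is at or above the diff root.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    rev: int
    kind: str  # "move" or "copy"
    src: str
    dst: str


# Toy history for the example above (hypothetical data model).
HISTORY = [
    Change(rev=20, kind="move", src="trunk/foo", dst="trunk/bar"),
    Change(rev=30, kind="copy", src="trunk", dst="branch1"),
]


def is_at_or_below(path, ancestor):
    """True if path equals ancestor or lies inside it."""
    return path == ancestor or path.startswith(ancestor + "/")


def rebase(path, old_root, new_root):
    """Replace the leading old_root of path with new_root."""
    return new_root + path[len(old_root):]


def origin_of(node, diff_root, start_rev, end_rev, history=HISTORY):
    """Trace node@end_rev back to its path at start_rev.

    diff_root is the diff target path ("" stands for the repository root).
    Returns None if the node arrived through a copy made below the diff
    root, in which case it is treated as an addition, not a move.
    """
    path, root = node, diff_root
    for change in sorted(history, key=lambda c: c.rev, reverse=True):
        if not (start_rev < change.rev <= end_rev):
            continue
        if not is_at_or_below(path, change.dst):
            continue  # this change did not affect the traced node
        if change.kind == "copy" and not is_at_or_below(root, change.dst):
            return None  # copy happened below the diff root
        if is_at_or_below(root, change.dst):
            root = rebase(root, change.dst, change.src)
        path = rebase(path, change.dst, change.src)
    return path


print(origin_of("branch1/bar", "branch1", 10, 30))
print(origin_of("branch1/bar", "", 10, 30))
```

With the diff rooted at branch1 (or branch1/bar) the trace leads back to trunk/foo, while with the diff rooted at ^/ the copy is below the root and the function reports no move origin.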

One way of looking at this is that our history-tracing that's used to find "-r10 branch@30" in such scenarios is *already* following copies at the root of the subtree as if they are moves, and in a way this would be extending that idea.

This seems like functionality that should be provided in a higher layer; the FS layer just needs to provide some primitive queries to make this possible. I'm not sure what, exactly.

>* Support for resurrection of deleted nodes *without* destroying any move
> relationship is potentially expensive but I think we should support this
> early on (maybe not in 1.9 but def. in 1.10). People just happen to delete
> their /trunk once in a while and you don't want to tell them that *now* they
> actually managed to break something ...


> Proposal: Resurrect keeping the old node-line IDs, iff
> (a) the copy source (or a parent) got deleted in the next revision
> (b) no copies of that node (or any parent) were added since the source rev.
> That should keep normal copying relatively cheap and still provide the
> special behavior for our "undo" use-case.

I think you mean that the user-level copy should do this automatically. Perhaps so. That need not be implemented in the FS layer; the FS layer could just provide the primitives necessary to implement it.
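
As a rough illustration of what such a check might look like above the FS layer, here is a small Python sketch of conditions (a) and (b). Everything in it is an assumption made for illustration: the Event record, the event list and may_keep_node_line_id() are hypothetical, and "next revision" is read as "the revision immediately after the copy source revision".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    rev: int
    kind: str  # "delete" or "copy"
    path: str  # deleted path, or the source path of a copy


def is_at_or_below(path, ancestor):
    """True if path equals ancestor or lies inside it."""
    return path == ancestor or path.startswith(ancestor + "/")


def may_keep_node_line_id(src_path, src_rev, events):
    """Decide whether copying src_path@src_rev counts as a resurrection.

    (a) the source, or one of its parents, was deleted in src_rev + 1
    (b) no copy of the source, or of one of its parents, was made
        after src_rev
    """
    deleted_next = any(
        e.kind == "delete"
        and e.rev == src_rev + 1
        and is_at_or_below(src_path, e.path)
        for e in events
    )
    copied_since = any(
        e.kind == "copy"
        and e.rev > src_rev
        and is_at_or_below(src_path, e.path)
        for e in events
    )
    return deleted_next and not copied_since


# Someone deletes /trunk in r11, then copies trunk@10 back to undo it.
events = [Event(rev=11, kind="delete", path="trunk")]
print(may_keep_node_line_id("trunk/foo", 10, events))
```

If either condition fails, the copy would be treated as an ordinary copy that gets fresh node-line IDs, which keeps the common case cheap.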

> I'd like to implement that, after some more in-depth review, and would even
> be willing to postpone the cache-server feature to 1.10 because move tracking
> is much more important atm.


- Julian
Received on 2013-09-11 20:10:01 CEST

This is an archived mail posted to the Subversion Dev mailing list.