Re: Moves in FSFS

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Wed, 11 Sep 2013 19:09:03 +0100 (BST)

Thanks, Stefan!  Very interesting thoughts.  Especially... (scroll down)...

Stefan Fuhrmann wrote:
> On Wed, Sep 11, 2013 at 5:21 PM, Julian Foad wrote:
>>    http://wiki.apache.org/subversion/MoveDev/MoveDev#Move_Semantics
>>    http://wiki.apache.org/subversion/MoveDev/MovesInFSFS
>> [...]
>>
>> One issue that may be harder than it sounds at first is the concept of
>> 'node-line-id' rather than (node-id, copy-id) as the basis of the
>> definition.  The point is that when we copy (ordinary copy, not move)
>> a directory, we lazy-copy the children, which means each child keeps
>> its old (node-id, copy-id) unless and until it is modified.  That's
>> great for achieving the O(1) copy, but for move-tracking purposes each
>> child needs a unique "node-line-id" so its life-line can be uniquely
>> traced forward and back between this revision and a later revision by
>> which time it may have been modified and thus assigned a new copy-id.
>>
>> Clearly it would defeat the O(1) cost if we were to construct a
>> node-line-id explicitly for every node in the tree at copy time.  Can
>> we instead define node-line-id such that we can compute it as needed,
>> from either an unmodified lazy-copied child or after such a child has
>> been modified, and get the same answer?  Or perhaps re-state the
>> problem to avoid this need?
>
> I'm currently bogged down in svnlive prep work but here is my quick feedback;
> more to come next week.
>
> Bottom line: looks ok, especially the API seems fine and performance in f7
> should be acceptable even if it is O(changes in [rA .. rB]).
>
> General observations:
>
>* We need a format bump for the extra "M" entries in the changed path
>  lists, potential "lazy" markers in the tree etc. But that is not a problem
>  as the log-addressing branch probably gets merged in about 2 weeks'
>  time and bumps to f7 anyway. It also brings the infrastructure for
>  "mixed addressing" such that we may introduce extended structures
>  in existing repositories without touching existing revisions.
>
>* Existing copy&del pairs will not be treated as move since the node-line-id
>  does not match. Maybe, we can add some intelligence to 'svnadmin load'.
>
>* A copy effectively destroys all move relationships below it. That seems
>  unfortunate (say, you duplicate a project) but the solution to that would
>  probably require hierarchical IDs ("match IDs within the context of this
>  sub-tree").

That's a good observation.  Here's an example, to clarify:

  r10: trunk/foo

move foo to bar

  r20: trunk/bar

copy trunk to branch1

  r30: trunk/bar
       branch1/bar

Now request "svn diff -r10:30 branch1".  It would be useful if Subversion could say trunk/foo@10 moved to branch1/bar@30 in the context of this diff.  (Where I say "diff" we can also substitute "update", "merge", and so on.)

This only makes sense for a copy at or above the root path of the requested diff.  In this example, it makes sense for "diff -r10:30 branch1" and for "diff -r10:30 branch1/bar".  It does not make sense across a copy that happened below the target: in this case, "diff -r10:30 ^/" would NOT be expected to show foo@10->bar@30 as a move.

One way of looking at this is that our history-tracing that's used to find "-r10 branch@30" in such scenarios is *already* following copies at the root of the subtree as if they are moves, and in a way this would be extending that idea.

This seems like functionality that should be provided in a higher layer; the FS layer just needs to provide some primitive queries to make this possible.  I'm not sure what, exactly.
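
To make the "at or above the root path" rule concrete, here is a rough sketch in Python.  It is not Subversion code: the function names and the use of plain repository-relative path strings (with "" standing for ^/) are my own assumptions, purely for illustration.

```python
def is_at_or_above(ancestor: str, path: str) -> bool:
    """True if `ancestor` is `path` itself or a parent directory of it.

    Paths are repository-relative strings such as "branch1/bar";
    the empty string stands for the repository root (^/).
    """
    a = [p for p in ancestor.split("/") if p]
    b = [p for p in path.split("/") if p]
    return len(a) <= len(b) and b[:len(a)] == a


def copy_is_followed(copy_dst: str, diff_target: str) -> bool:
    """Hypothetical rule: a diff rooted at `diff_target` traces moves
    back through a copy only if the copy destination is at or above
    the diff target."""
    return is_at_or_above(copy_dst, diff_target)


# The r30 copy in the example above has destination "branch1".
print(copy_is_followed("branch1", "branch1"))      # True
print(copy_is_followed("branch1", "branch1/bar"))  # True
print(copy_is_followed("branch1", ""))             # False: copy is below ^/
```

Comparing whole path components rather than string prefixes keeps "branch1" from being treated as a parent of "branch10/bar".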

>* Support for resurrection of deleted nodes *without* destroying any move
>  relationship is potentially expensive but I think we should support this
>  early on (maybe not in 1.9 but def. in 1.10). People just happen to delete
>  their /trunk once in a while and you don't want to tell them that *now* they
>  actually managed to break something ...

Yes.

>  Proposal: Resurrect keeping the old node-line IDs, iff
>  (a) the copy source (or a parent) got deleted in the next revision
>  (b) no copies of that node (or any parent) were added since the source rev.
>  That should keep normal copying relatively cheap and still provide the
>  special behavior for our "undo" use-case.

I think you mean that the user-level copy should do this automatically.  Perhaps so.  That need not be implemented in the FS layer; the FS layer could just provide the primitives necessary to implement it.
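
For discussion, the two conditions could be written as a predicate over a list of changes, roughly as below.  This is only a sketch under my own assumptions: the Change record and the function names are hypothetical, "in the next revision" is read strictly as source rev + 1, and the change list is assumed to hold only what was committed before the resurrecting copy itself.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Change:
    """Hypothetical, simplified changed-path record."""
    rev: int
    action: str                     # "delete" or "copy"
    path: str                       # deleted path, or copy destination
    copy_src: Optional[str] = None  # copy source path, for "copy" only


def _at_or_above(ancestor: str, path: str) -> bool:
    """True if `ancestor` is `path` itself or a parent directory of it."""
    a = [p for p in ancestor.split("/") if p]
    b = [p for p in path.split("/") if p]
    return len(a) <= len(b) and b[:len(a)] == a


def may_keep_node_line_ids(changes: Iterable[Change],
                           src_path: str, src_rev: int) -> bool:
    """Check conditions (a) and (b) for a copy from src_path@src_rev."""
    changes = list(changes)
    # (a) the copy source, or a parent of it, was deleted in src_rev + 1
    deleted_next = any(
        c.action == "delete" and c.rev == src_rev + 1
        and _at_or_above(c.path, src_path)
        for c in changes)
    # (b) no copy of the source, or of a parent, was made after src_rev
    copied_since = any(
        c.action == "copy" and c.rev > src_rev
        and c.copy_src is not None
        and _at_or_above(c.copy_src, src_path)
        for c in changes)
    return deleted_next and not copied_since


# /trunk exists at r40 and is deleted in r41; resurrecting from trunk@40.
print(may_keep_node_line_ids([Change(41, "delete", "trunk")], "trunk", 40))  # True
```

Whether such a check lives in the FS layer or above it is exactly the open question; the sketch only shows that both conditions can be decided from changed-path information.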

> I'd like to implement that - after some more in-depth review - and would even
> be willing to postpone the cache-server feature to 1.10 because move tracking
> is much more important atm.

Wonderful!

- Julian
Received on 2013-09-11 20:10:01 CEST

This is an archived mail posted to the Subversion Dev mailing list.
