On Wed, Sep 11, 2013 at 5:21 PM, Julian Foad <julianfoad_at_btopenworld.com> wrote:
> While discussions continue about the "editor" and the WC side of move
> tracking, I'd like to make some progress on the Repository side.
>
> The Wiki page
>
> http://wiki.apache.org/subversion/MoveDev/MoveDev#Move_Semantics
>
> declares the semantic model and
>
> http://wiki.apache.org/subversion/MoveDev/MovesInFSFS
>
> attempts to specify the changes needed. I think the model for move
> semantics in the repository as presented there is pretty solid --
> precise and complete and having a nice set of properties.
>
> Can anyone review this? If anyone can take it forward in any way that
> would be a great help, otherwise I'll tackle the implementation
> myself.
>
> One issue that may be harder than it sounds at first is the concept of
> 'node-line-id' rather than (node-id, copy-id) as the basis of the
> definition. The point is that when we copy (ordinary copy, not move)
> a directory, we lazy-copy the children, which means each child keeps
> its old (node-id, copy-id) unless and until it is modified. That's
> great for achieving the O(1) copy, but for move-tracking purposes each
> child needs a unique "node-line-id" so its life-line can be uniquely
> traced forward and back between this revision and a later revision by
> which time it may have been modified and thus assigned a new copy-id.
>
> Clearly it would defeat the O(1) cost if we were to construct a
> node-line-id explicitly for every node in the tree at copy time. Can
> we instead define node-line-id such that we can compute it as needed,
> from either an unmodified lazy-copied child or after such a child has
> been modified, and get the same answer? Or perhaps re-state the
> problem to avoid this need?
>
Hi Julian,
I'm currently bogged down in svnlive prep work, but here is my quick
feedback; more to come next week.
Bottom line: looks OK. The API in particular seems fine, and performance
in f7 should be acceptable even if it is O(changes in [rA .. rB]).
General observations:
* We need a format bump for the extra "M" entries in the changed path
lists, potential "lazy" markers in the tree, etc. But that is not a problem,
as the log-addressing branch probably gets merged in about 2 weeks'
time and bumps to f7 anyway. It also brings the infrastructure for
"mixed addressing", such that we may introduce extended structures
in existing repositories without touching existing revisions.
* Existing copy&del pairs will not be treated as moves since their
node-line-ids do not match. Maybe we can add some intelligence to
'svnadmin load'.
* A copy effectively destroys all move relationships below it. That seems
unfortunate (say, you duplicate a project) but the solution to that would
probably require hierarchical IDs ("match IDs within the context of this
sub-tree").
* Support for resurrection of deleted nodes *without* destroying any move
relationship is potentially expensive, but I think we should support this
early on (maybe not in 1.9 but definitely in 1.10). People just happen to
delete their /trunk once in a while, and you don't want to tell them that
*now* they actually managed to break something ...
Proposal: Resurrect keeping the old node-line IDs, iff
(a) the copy source (or a parent) got deleted in the next revision
(b) no copies of that node (or any parent) were added since the source rev.
That should keep normal copying relatively cheap and still provide the
special behavior for our "undo" use-case.
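To make the two conditions concrete, here is a rough sketch in Python (not
Subversion code). Everything in it is made up for illustration: the Change
record, the history dict mapping revision numbers to changed-path lists, and
the function names. It also assumes one particular reading of (b), namely
that a "copy of that node" is any add-with-history whose copy source is the
node or one of its parents.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Change:
    """Hypothetical changed-path entry; not an actual FSFS structure."""
    action: str                          # "A" = add (maybe a copy), "D" = delete
    path: str
    copyfrom_path: Optional[str] = None  # set only for adds with history
    copyfrom_rev: Optional[int] = None


def is_self_or_ancestor(candidate: str, path: str) -> bool:
    """True if candidate equals path or is a parent directory of it."""
    return path == candidate or path.startswith(candidate.rstrip("/") + "/")


def resurrect_keeps_ids(history: Dict[int, List[Change]],
                        src_path: str, src_rev: int, new_rev: int) -> bool:
    """Decide whether copying src_path@src_rev in new_rev may keep the
    old node-line IDs, following conditions (a) and (b) above."""
    # (a) the copy source (or a parent) got deleted in the next revision
    deleted_next = any(
        c.action == "D" and is_self_or_ancestor(c.path, src_path)
        for c in history.get(src_rev + 1, [])
    )
    if not deleted_next:
        return False

    # (b) no copies of that node (or any parent) since the source rev
    for rev in range(src_rev + 1, new_rev):
        for c in history.get(rev, []):
            if (c.action == "A" and c.copyfrom_path is not None
                    and is_self_or_ancestor(c.copyfrom_path, src_path)):
                return False
    return True


# /trunk deleted in r11, resurrected from r10 in r12: IDs may be kept.
undo = {11: [Change("D", "/trunk")]}
print(resurrect_keeps_ids(undo, "/trunk/a.c", 10, 12))
```

In this sketch the cost is the scan over every revision between the source
and the resurrection; a real implementation would presumably want something
cheaper than that.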
I'd like to implement that, after some more in-depth review, and would even
be willing to postpone the cache-server feature to 1.10 because move tracking
is much more important atm.
-- Stefan^2.
Received on 2013-09-11 18:47:17 CEST