
Re: Moves in FSFS

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Wed, 11 Sep 2013 18:46:41 +0200

On Wed, Sep 11, 2013 at 5:21 PM, Julian Foad <julianfoad_at_btopenworld.com> wrote:

> While discussions continue about the "editor" and the WC side of move
> tracking, I'd like to make some progress on the Repository side.
>
> The Wiki page
>
> http://wiki.apache.org/subversion/MoveDev/MoveDev#Move_Semantics
>
> declares the semantic model and
>
> http://wiki.apache.org/subversion/MoveDev/MovesInFSFS
>
> attempts to specify the changes needed. I think the model for move
> semantics in the repository as presented there is pretty solid --
> precise and complete and having a nice set of properties.
>
> Can anyone review this? If anyone can take it forward in any way that
> would be a great help, otherwise I'll tackle the implementation
> myself.
>
> One issue that may be harder than it sounds at first is the concept of
> 'node-line-id' rather than (node-id, copy-id) as the basis of the
> definition. The point is that when we copy (ordinary copy, not move)
> a directory, we lazy-copy the children, which means each child keeps
> its old (node-id, copy-id) unless and until it is modified. That's
> great for achieving the O(1) copy, but for move-tracking purposes each
> child needs a unique "node-line-id" so its life-line can be uniquely
> traced forward and back between this revision and a later revision by
> which time it may have been modified and thus assigned a new copy-id.
>
> Clearly it would defeat the O(1) cost if we were to construct a
> node-line-id explicitly for every node in the tree at copy time. Can
> we instead define node-line-id such that we can compute it as needed,
> from either an unmodified lazy-copied child or after such a child has
> been modified, and get the same answer? Or perhaps re-state the
> problem to avoid this need?
>

Hi Julian,

I'm currently bogged down in svnlive prep work but here is my quick
feedback; more to come next week.

Bottom line: looks ok. The API in particular seems fine, and performance
in f7 should be acceptable even if it is O(changes in [rA .. rB]).
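
To illustrate where the O(changes in [rA .. rB]) cost would come from, here is a rough sketch of following a path forward across a revision range by scanning per-revision move entries. The `moves` mapping is a hypothetical stand-in for the "M" entries in the changed path lists, `trace_forward` is a made-up name, and deletes, replacements and several chained moves within one revision are ignored. This is not the FSFS implementation, only the shape of the loop.

```python
from typing import Dict, List, Tuple

# Hypothetical model: revision number -> list of (from_path, to_path) moves,
# standing in for the "M" entries of that revision's changed-path list.
Moves = Dict[int, List[Tuple[str, str]]]


def trace_forward(path: str, rev_a: int, rev_b: int, moves: Moves) -> str:
    """Follow `path` as it exists in rev_a through the moves in (rev_a, rev_b].

    Every move entry in the range is inspected once, so the cost grows with
    the number of recorded changes between the two revisions.
    Assumption: at most one move per revision affects the traced path.
    """
    for rev in range(rev_a + 1, rev_b + 1):
        for src, dst in moves.get(rev, []):
            if path == src:
                # The node itself was moved.
                path = dst
                break
            if path.startswith(src + "/"):
                # A parent directory was moved; rewrite the path prefix.
                path = dst + path[len(src):]
                break
    return path
```

For example, with a move of /trunk/a to /trunk/b in r2 and of /trunk to /main in r4, tracing /trunk/a/f.c from r1 to r5 yields /main/b/f.c.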

General observations:

* We need a format bump for the extra "M" entries in the changed path
  lists, potential "lazy" markers in the tree etc. But that is not a problem
  as the log-addressing branch probably gets merged in about 2 weeks'
  time and bumps to f7 anyway. It also brings the infrastructure for
  "mixed addressing" such that we may introduce extended structures
  in existing repositories without touching existing revisions.

* Existing copy&del pairs will not be treated as moves since their
  node-line-ids do not match. Maybe we can add some intelligence to
  'svnadmin load'.

* A copy effectively destroys all move relationships below it. That seems
  unfortunate (say, you duplicate a project) but the solution to that would
  probably require hierarchical IDs ("match IDs within the context of this
  sub-tree").

* Support for resurrection of deleted nodes *without* destroying any move
  relationship is potentially expensive but I think we should support this
  early on (maybe not in 1.9 but def. in 1.10). People just happen to delete
  their /trunk once in a while and you don't want to tell them that *now*
  they actually managed to break something ...

  Proposal: Resurrect keeping the old node-line IDs, iff
  (a) the copy source (or a parent) got deleted in the next revision
  (b) no copies of that node (or any parent) were added since the source
      rev.

  That should keep normal copying relatively cheap and still provide the
  special behavior for our "undo" use-case.
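
The two conditions of the resurrection proposal can be written down as a small predicate. This is only a sketch under assumptions: the `deleted` and `copy_sources` mappings (revision number to set of paths) are hypothetical stand-ins for what the changed path lists would provide, the function names are invented, and "copies of that node" is read here as "copy operations whose source is that path or one of its parents", which is one possible interpretation.

```python
from typing import Dict, Set


def ancestors_and_self(path: str) -> Set[str]:
    """Return `path` plus all of its parent paths, e.g. /a/b -> {/a, /a/b}."""
    parts = path.strip("/").split("/")
    return {"/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)}


def may_keep_node_line_ids(
    src_path: str,
    src_rev: int,
    new_rev: int,
    deleted: Dict[int, Set[str]],       # hypothetical: rev -> deleted paths
    copy_sources: Dict[int, Set[str]],  # hypothetical: rev -> copy source paths
) -> bool:
    """Decide whether copying src_path@src_rev in new_rev counts as an "undo".

    (a) the copy source (or a parent) was deleted in src_rev + 1, and
    (b) no copy of it (or of a parent) was made after src_rev and before
        new_rev.
    """
    lineage = ancestors_and_self(src_path)
    deleted_next = bool(lineage & deleted.get(src_rev + 1, set()))
    copied_since = any(
        lineage & copy_sources.get(rev, set())
        for rev in range(src_rev + 1, new_rev)
    )
    return deleted_next and not copied_since
```

With /trunk deleted in r6, copying /trunk@5 back in r8 would keep the old IDs, while an intervening copy from /trunk in r7 would make it an ordinary copy.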

I'd like to implement that, after some more in-depth review, and would even
be willing to postpone the cache-server feature to 1.10 because move
tracking is much more important atm.

-- Stefan^2.
Received on 2013-09-11 18:47:17 CEST

This is an archived mail posted to the Subversion Dev mailing list.