
Definition of Node-Line-ID

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Tue, 17 Sep 2013 01:19:15 +0200

Hi there,

After two hours of analysis, it seems that I have found
the correct definition for the "node line ID" as required
by Julian's move API design:

NodeLineID(Path,Rev) := LastCopyTarget@LastCopyRev

where LastCopyRev is the revision of the latest copy that
involved Path@Rev (or any of its parents). LastCopyTarget
is the path to which Path got copied in LastCopyRev.
If a node has not been copied, LastCopyRev is the
revision in which this node got added.

Note that LastCopyTarget may differ from Path if
moves in (LastCopyRev, Rev] changed the path
(or any of its parents).
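
To make the definition concrete, here is a minimal sketch on a
toy history model. It is not Subversion code: the Event type, the
node_line_id() function and the "one event per revision" layout
are all assumptions made for illustration. It further assumes that
copies always take their source from the preceding revision and
that deletions do not occur. IDs are rendered as "path@rev".

```python
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """Hypothetical toy model: exactly one change per revision."""
    kind: str                    # "add", "copy" or "move"
    dst: str                     # path created by this change
    src: Optional[str] = None    # source path for "copy" / "move"


def _rebase(path, old_root, new_root):
    """Map path from below old_root to below new_root.

    Returns None if path is neither old_root nor inside it.
    """
    if path == old_root:
        return new_root
    if path.startswith(old_root + "/"):
        return new_root + path[len(old_root):]
    return None


def node_line_id(history, path, rev):
    """Walk history backwards from path@rev to the last copy or add.

    history[r] is the Event committed in revision r (index 0 unused).
    Returns "LastCopyTarget@LastCopyRev", or None if path@rev does
    not exist in this toy repository.
    """
    for r in range(rev, 0, -1):
        ev = history[r]
        if ev.kind == "add":
            if ev.dst == path:
                return f"{path}@{r}"      # never copied: added here
            continue
        before = _rebase(path, ev.dst, ev.src)
        if before is None:
            # A move away from here means path no longer exists.
            if ev.kind == "move" and _rebase(path, ev.src, ev.dst):
                return None
            continue
        if ev.kind == "move":
            path = before                 # same node line, older name
            continue
        # Copy of path or one of its parents: a new node line starts
        # here, provided the copy source actually existed.
        if node_line_id(history, before, r - 1) is None:
            return None
        return f"{path}@{r}"
    return None


history = [
    None,
    Event("add", "/trunk"),                   # r1
    Event("add", "/trunk/f"),                 # r2
    Event("copy", "/branch", "/trunk"),       # r3
    Event("move", "/branch/g", "/branch/f"),  # r4
    Event("move", "/b2", "/branch"),          # r5
]

print(node_line_id(history, "/trunk/f", 5))   # /trunk/f@2
print(node_line_id(history, "/b2/g", 5))      # /branch/f@3
```

In this example the copy in r3 starts a new node line for
everything below /branch, while the two later moves leave the
ID of that file untouched even though its path is /b2/g by r5.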

Properties of NodeLineID as defined above:

* defined for any path@rev in the repository
* exactly one ID for the same path@rev
* not defined for any path@rev that does not exist
  in the repository

* is unique for all paths at any one revision
* does change [exactly] when path or any of its parents
  gets copied and only for the copied tree
* does *not* change when path or any of its parents
  gets renamed / moved

I.e. LastCopyTarget@LastCopyRev assigns exactly one
ID to each line of non-copying node history in the repository.
All alternative definitions should, therefore, be equivalent
to the one given here.

In particular, the following cannot simply be used:

* (nodeId, copyID) since there are usually fewer nodes
  in the DAG than there are paths @HEAD alone
  (approx. 1:2 for apache.org)
* replacing the current lazy copying with deep copies
  would create nodes exactly for all
  LastCopyTarget@LastCopyRev values, i.e. is equivalent
  but less space-efficient

* anything involving parent paths or IDs because a move
  to a different parent must not change the node line id.

The given definition even has nice computational properties:

* LastCopyTarget@LastCopyRev can be directly expressed
  as such in a developer-readable string
* The code to find the LastCopyRev is part of the standard
  history following code and relatively efficient
* For all entries in the same directory@rev, the effort can be
  lowered to O(1) in many cases since the respective parents
  have already been investigated (depends on internal node-ID
  assignment rules).

-- Stefan^2.
Received on 2013-09-17 01:19:49 CEST

This is an archived mail posted to the Subversion Dev mailing list.
