
Re: Symmetry for branching, move tracking and merging

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Tue, 17 Mar 2015 14:06:45 +0000

I (Julian Foad) wrote:
> * uniformity of the difference from branch1_at_r1 to branch2_at_r2
> for any values of: branch1, r1, branch2, r2
> where branch1 and branch2 are 'related' (formally: in the same branch family)

Branko Čibej wrote:
> The pure-difference is called 'svn_repos_replay'. [...]
> And of course there's 'svnadmin dump' [...]

No, those are interfaces to the repository model, the *linear*
sequence of snapshots of the whole repository contents.

I'm talking about the branching model, the *tree* of snapshots of the
contents of "a branch" of (a directory-tree of) a project's files.
When I say 'difference' I'm talking about the kind of difference that
we need as input for a merge: a difference between any two path-revs
pathX_at_rX and pathY_at_rY on related branches.

> The pure state ... isn't that just 'svn checkout'?

A clean checkout of a branch may contain all the information that
comprises the logical state of the branch, but it also contains a lot
of other information -- metadata identifying *which* logical state of
which branch was checked out (URLs, revs, last-changed info, etc.) and
information specific to the WC (local paths, time stamps, depths, and
so on) -- and the dividing line between the two is not explicit.

>> To really see what's missing, we need to focus on the difficult parts
>> which include:
>>
>> * mergeinfo
>
> I'm not at all sure that mergeinfo is even part of the versioned state,
> never mind current implementation. If it is, we're certainly not
> modelling it correctly. But we know that.

Agreed.

>> * copies (what to do with 'copy-from' information)
>>
>> * move tracking
> [...]
>> The underlying problem, I suggest, is that we haven't clearly defined
>> how copying fits into the model of 'versioned state' and 'difference'.
>
> Well, there's an alternative interpretation ... that your model of
> 'versioned state' and 'difference' is incomplete. :) I can't think of a
> different way of recording copies (as opposed to moves!) than saying,
> "this is a new node which was created from [predecessor noderev_at_revision])".

This is extremely challenging to answer. That's good. I need you to
challenge me on this -- trying to explain it forces me to understand
what I really mean.

As a thought experiment, here is a rough description of a very
different behavioural model of copying. The goal here is to think
outside the current constraints, imagining what *could* be, before we
bring the constraints back in and settle on what *can* be.

  * set up for the example ...
    - r1: create 'trunk'
    - r4: create 'trunk/foo'
    - r6: branch 'trunk' to 'branch'

  * r12: copying 'trunk/foo_at_10' to 'trunk/bar' ...
    - looks up 'trunk/foo_at_10'
    - finds it is an instance of element 31 of branch family 1 (let's
      write <f1.e31>)
    - creates a new element <f1.e55> of the same kind (let's say it's a 'file')
    - tags <f1.e55> as 'copied from <f1.e31>'
    - creates an instance of <f1.e55> at 'trunk/bar', with the same
      content as 'foo_at_10'
    - does not record a specific revision, path, branch or instance of
      the copy source

  * r15: merging 'trunk' (everything up to r12) to 'branch' (base is r13) ...
    - creates an instance of <f1.e55> at 'branch/bar'
    - content of 'branch/bar' = 3-way-merge(YCA=trunk/foo_at_12,
      left=trunk/bar_at_12, right=branch/foo_at_13)
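
To make the steps above concrete, here is a minimal Python sketch of that
putative model. It is only an illustration, not Subversion code: the names
Element, Family, copy, merge_copied_element and toy_merge3 are all
hypothetical, revisions are left out entirely (each branch holds only its
current state), and the 3-way merge is a toy that works on whole strings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Element:
    """One element of a branch family, e.g. <f1.e31> (hypothetical type)."""
    family: int
    eid: int
    kind: str
    copied_from: Optional[int] = None  # element id only: no path, rev or branch

    def __str__(self) -> str:
        return f"<f{self.family}.e{self.eid}>"


class Family:
    """A branch family: its elements plus their instances in each branch."""

    def __init__(self, fid: int) -> None:
        self.fid = fid
        self.elements: Dict[int, Element] = {}
        # branch name -> {path within branch -> (element id, content)}
        self.branches: Dict[str, Dict[str, Tuple[int, str]]] = {}

    def add_element(self, eid: int, kind: str,
                    copied_from: Optional[int] = None) -> Element:
        elem = Element(self.fid, eid, kind, copied_from)
        self.elements[eid] = elem
        return elem

    def instantiate(self, branch: str, path: str, eid: int, content: str) -> None:
        self.branches.setdefault(branch, {})[path] = (eid, content)

    def path_of(self, branch: str, eid: int) -> str:
        for path, (found, _content) in self.branches[branch].items():
            if found == eid:
                return path
        raise KeyError(f"element {eid} has no instance in {branch!r}")

    def copy(self, branch: str, src_path: str, dst_path: str,
             new_eid: int) -> Element:
        """Create a new element tagged 'copied from' the source element."""
        src_eid, content = self.branches[branch][src_path]
        new = self.add_element(new_eid, self.elements[src_eid].kind,
                               copied_from=src_eid)
        self.instantiate(branch, dst_path, new_eid, content)
        return new

    def merge_copied_element(self, src_branch: str, dst_branch: str, eid: int,
                             merge3: Callable[[str, str, str], str]) -> str:
        """Instantiate a copied element in dst_branch, merging in the
        target branch's own instance of the copy source."""
        elem = self.elements[eid]
        if elem.copied_from is None:
            raise ValueError(f"{elem} is not a copy")
        new_path = self.path_of(src_branch, eid)
        left = self.branches[src_branch][new_path][1]
        base = self.branches[src_branch][self.path_of(src_branch, elem.copied_from)][1]
        right = self.branches[dst_branch][self.path_of(dst_branch, elem.copied_from)][1]
        merged = merge3(base, left, right)
        self.instantiate(dst_branch, new_path, eid, merged)
        return merged


def toy_merge3(base: str, left: str, right: str) -> str:
    """Whole-string 3-way merge: take whichever side changed."""
    if left == right or right == base:
        return left
    if left == base:
        return right
    raise ValueError("conflict: both sides changed")


fam = Family(1)
fam.add_element(31, "file")
fam.instantiate("trunk", "foo", 31, "hello\n")
fam.instantiate("branch", "foo", 31, "hello, branch\n")  # edited on 'branch'
e55 = fam.copy("trunk", "foo", "bar", new_eid=55)
fam.merge_copied_element("trunk", "branch", 55, toy_merge3)
print(e55, "copied from", fam.elements[e55.copied_from])
print(repr(fam.branches["branch"]["bar"][1]))
```

In this sketch 'branch/bar' ends up with the content of 'branch/foo',
because the copy on trunk made no change relative to 'trunk/foo'.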

What do we want to 'see' in the branch? One visible effect of the
'copied' flag in this model is that merge sets the content of
branch/bar not just from trunk/bar (as happens today) but also from
branch/foo. (I think that is better, but that's not the point at the
moment. The point is just to explore in what ways a different
behavioural model might be visible to users.)

Another effect that we might want to see, and would certainly expect
coming from the current design, is that some kind of 'history' command
such as 'svn log' should show that 'bar' is a copy. What we see in the
existing model for a 'log -vq' is:

$ svn log -vq -c15
r15 | julianfoad | 2015-03-16...
Changed paths:
   M /branch <-- the mergeinfo
   A /branch/bar (from /trunk/bar:12)

That says branch/bar was copied from trunk/bar, which it was in the
existing model. But in this putative model that we're exploring, it
should say something different. Perhaps:

$ svn log -vq -c15
r15 | julianfoad | 2015-03-16...
Changes merged to branch 'branch' from branch 'trunk':
   A 'bar' (a copy of 'foo')
   [details: 'bar' is <e55> which is a copy from <e31> which exists at 'foo']

The only other area where a copy has special behaviour is in tracing
the history of 'an object'. In the existing model, the history of
'branch/bar_at_15' consists of 'history segments' which are:

  branch/bar : 15 - 15
  (gap)
  trunk/bar : 12 - 12
  (gap)
  trunk/foo : 4 - 10

The putative model might define copy-history as gapless and as staying
within the branch as long as possible. Expressed as path history
segments:

  branch/bar : 15 - 15 # content is a merge of trunk/bar_at_12 & branch/foo_at_13 ...
  branch/foo : 6 - 14 # ... but copy-history is defined as continuing
                      #     right up to r(15 - 1)
  trunk/foo : 4 - 5 # here 'branch' and 'trunk' have a shared history
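
As a quick check of the 'gapless' property, the two segment listings can
be compared with a few lines of Python. This is just a sketch: the
Segment tuple layout and the gaps() helper are my own illustration, not
an existing Subversion API.

```python
from typing import List, Tuple

Segment = Tuple[str, int, int]  # (path, first revision, last revision)


def gaps(history: List[Segment]) -> List[Tuple[int, int]]:
    """Return the revision ranges left uncovered between consecutive
    segments. `history` is ordered newest first, as in the listings."""
    missing = []
    for (_, newer_first, _), (_, _, older_last) in zip(history, history[1:]):
        if older_last + 1 < newer_first:
            missing.append((older_last + 1, newer_first - 1))
    return missing


# History of 'branch/bar_at_15' in the existing model.
existing = [("branch/bar", 15, 15), ("trunk/bar", 12, 12), ("trunk/foo", 4, 10)]
# The same history in the putative model.
putative = [("branch/bar", 15, 15), ("branch/foo", 6, 14), ("trunk/foo", 4, 5)]

print(gaps(existing))  # [(13, 14), (11, 11)]
print(gaps(putative))  # []
```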

Anyway, that was all just to illustrate how it is conceivable that
there could be a different model of 'copying' that is not based on a
copy-from path and a copy-from revision.

More to follow.

- Julian
Received on 2015-03-17 15:13:55 CET

This is an archived mail posted to the Subversion Dev mailing list.