
Re: svnmover feedback

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Tue, 5 May 2015 22:00:52 +0000

Julian Foad wrote on Fri, May 01, 2015 at 14:24:06 +0100:
> Daniel Shahaf wrote:
> > Julian Foad wrote on Thu, Apr 30, 2015 at 10:30:39 +0100:
> > > ...
> > As an alternative, element ids could be made unique only throughout
> > their branch and all branches that are copy-wise-descendants of the
> > copy-wise-primogenitor of the branch that contains them — so, for
> > example, /subversion/(trunk|branches/*) are one "element id space", but
> > /httpd/httpd/(trunk|branches/*) would be (together) another "element
> > id space".
> >
> > After all, merging from httpd/trunk to subversion/trunk
> > isn't defined; letting them have distinct element-id spaces would
> > model that undefinedness. (I'm not sure how this compares with
> > "branch families", which had been eradicated before I took a close look
> > at the branch. Let me know if I just rehashed something that's been
> > discussed already.)
>
> Indeed, that was exactly the "branch families" model. Originally the
> segregation seemed to be necessary in order to model subbranches, but
> it is no longer necessary for anything. It is not necessary to have
> segregated sets of EIDs in order to recognize that 'httpd' branches
> have nothing in common with 'subversion' branches: the lack of any EID
> appearing in both is sufficient.
>
> Non-segregation is useful for certain scenarios such as if the user
> decides to combine previously separate branch families into one
> family. This is demonstrated in
>
> svnmover_tests.py 11: restructure repo: projects/ttb to ttb/projects
>

Another use-case for global EIDs: moving a file from mod_dav_svn (in
svn) to mod_dav (in httpd).
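
To make the point about shared EIDs concrete, here is a minimal sketch. It
assumes a branch can be modelled as a plain mapping from EID to path; the
function names, EIDs, and paths are all made up for illustration and are
not svnmover's API or on-disk format. It shows that two projects count as
unrelated simply because no EID appears in both, and that a cross-project
move in a single global EID space creates exactly one shared element.

```python
# Hypothetical model, not svnmover's API: a branch maps EID -> path.

def shared_eids(branch_a, branch_b):
    """Return the set of EIDs that appear in both branches."""
    return set(branch_a) & set(branch_b)

def move_across(src, dst, eid, new_path):
    """Move an element to another branch while keeping its EID.

    This only makes sense if EIDs are unique across the whole repository.
    """
    if eid in dst:
        raise ValueError(f"EID {eid} already present in destination")
    src.pop(eid)            # raises KeyError if the element is not in src
    dst[eid] = new_path

# Illustrative data only.
svn_trunk = {1: "mod_dav_svn/util.c", 2: "libsvn_fs/fs.c"}
svn_branch = dict(svn_trunk)          # branching keeps the same EIDs
httpd_trunk = {10: "modules/dav/mod_dav.c"}

before = shared_eids(svn_branch, httpd_trunk)   # no EID in common
move_across(svn_trunk, httpd_trunk, 1, "modules/dav/util.c")
after = shared_eids(svn_branch, httpd_trunk)    # the moved element
```

After the move, looking up the shared EID in each mapping answers "what is
this element called on that branch": `svn_branch[1]` versus `httpd_trunk[1]`.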

> If we want to model segregation of separate projects, this can be done
> on top of the single-family model, whereas the reverse would not have
> been possible.
>

I'm convinced :-)

> >> Another significant property of a branch in this model is the
> >> one-to-one correspondence between the (instances of) elements in this
> >> branch and those in another branch, for the elements that appear in
> >> both of the branches.
> >
> > I'm not sure I understand. Are you saying that element ids allow us to
> > easily answer the question "What has element X on this branch been
> > renamed to on that branch"? If so, then yes, it does, but how does it
> > handle bifurcations [...]?
>
> Bifurcation (splitting and joining) is deliberately not handled
> specially -- only in the same way that it is now, by the user choosing
> at most one 'tine' of the 'fork' to be the successor and the other(s)
> to be plain copies. I decided modelling bifurcation explicitly at this
> level would be too complex to justify for the relatively rare cases
> where it could be useful.
>
> The first level of complexity is reached as soon as you realize that
> 'splitting' is not a one-to-two relationship but a many-to-many
> relationship (because you might split the same thing on two different
> branches) and thus leads to a concept of groups of related elements,
> and you have to choose what kind of relationship they should have --
> perhaps a flat 'set' relationship, or a hierarchical relationship like
> the tree formed by the existing 'copying' relationship. And then
> justify why should it be different from what the 'copying'
> relationship gives us.
>
> In fact, the 'copy' relationship might be the *right* way to represent
> splitting. Bear in mind that 'copying' will no longer be needed for
> branching or merging-an-add or resurrection, as all of these are
> handled explicitly by the new model.
>

Sure, there is no need to invent two kinds of copy operations, one for
bifurcations and one for everything else. I think the main question is:
should the data model efficiently support the operations that a "merge
through copies" functionality¹ would require? That functionality itself
(up in libsvn_client) needn't be implemented right now, but we may want
to design the data model with an eye towards possibly implementing that
functionality in the future.

Basically, while we're inventing a new data model anyway, would it be
feasible to make it support cheap computation of copyto? So the mv-t-2
branch introduces a data model that supports move tracking (via EIDs)
and cheap copyto computation, and implements move-aware merge tracking
on top of that data model, but doesn't implement any features on top of
the cheaply-available copyto information.

¹ To be explicit, the use-case I have in mind is: change a file on
trunk, split it on branch, merge the trunk mod to branch, have the
change propagate to both copies of the file on branch.
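
A rough sketch of what "cheap copyto" could look like for that use-case,
under stated assumptions: copies are recorded as (source EID, new EID)
pairs, and the split on the branch keeps the original EID for one half and
gives the other half a fresh EID, as a plain copy. Nothing here is taken
from the mv-t-2 branch; `build_copyto_index` and `merge_targets` are
hypothetical names, and how copyto would actually be stored is left open.

```python
from collections import defaultdict

def build_copyto_index(copies):
    """Invert copyfrom records into a source-EID -> [copied EIDs] index."""
    index = defaultdict(list)
    for src_eid, dst_eid in copies:
        index[src_eid].append(dst_eid)
    return index

def merge_targets(eid, branch_eids, copyto):
    """Return the elements on the branch that a change to `eid` should reach.

    That is the element itself (if it exists on the branch) plus every
    element on the branch copied from it, directly or transitively.
    """
    targets, stack, seen = [], [eid], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in branch_eids:
            targets.append(current)
        stack.extend(copyto.get(current, []))
    return sorted(targets)

# Illustrative data: foo.c (EID 5) is split on the branch into foo.c and
# bar.c, where bar.c is recorded as a copy with a new EID 6.
branch = {5: "foo.c", 6: "bar.c"}
copyto = build_copyto_index([(5, 6)])

# A trunk change to EID 5 would be applied to both halves on the branch.
targets = merge_targets(5, branch, copyto)
```

With the index precomputed, finding the merge targets costs time
proportional to the number of copies reachable from the changed element,
rather than requiring a scan of history.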

> > and I don't see what use-case it serves
> > other than users who accidentally created their contents in ^/ rather
> > than in ^/trunk.
>
> That's an intentionally served use case. The suggestion of requiring
> people to have designated /trunk as a branch *before* creating any
> contents in it was considered very bad for compatibility with
> historical usage. This model has no such restriction.
>

That's not quite what I said, but as you said, it's a UI concern.

The parts I snipped I agree with. Thanks for the answers and output
snippets.

Daniel
Received on 2015-05-06 00:01:34 CEST

This is an archived mail posted to the Subversion Dev mailing list.
