Re: branching several times a day

From: Nuutti Kotivuori <naked_at_iki.fi>
Date: 2004-03-22 23:34:42 CET

Brad Appleton wrote:
> On Sun, Mar 21, 2004 at 12:39:17AM +0200, Nuutti Kotivuori wrote:
>> Then again - "cvs update -j" is used for *merging*.
> Yes - that is the usage I meant (sorry about that).

Ah, okay. Then it all makes sense.

>> In the original context where I used that expression it is all
>> pretty much the same. The whole point is that after a commit to the
>> mainline, there are no changes in the branch that are uncommitted -
>> so "rebranching" is only to let the version control system know
>> that further changes are against the mainline at that point, and
>> not against the last edit on the previous change done on the branch
>> - there are no changes to merge, just metadata to inform.
> Right! (thank you for stating it so clearly). So what I'm seeing is
> that while SVN does indeed support change-sets, there is an implicit
> assumption by "svn log" that historical information about a branch
> starts with the beginning of the branch,

Right. Or rather, by default it assumes that the historical
information starts from the beginning of the entire codeline - you
have to tell it explicitly not to follow branches if you want it to
stop at the beginning of the branch.
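
To make the two behaviours concrete, here is a small Python sketch
that only builds the two "svn log" command lines (it does not run
them). The --stop-on-copy option is the real switch for this; the
helper name log_command and the repository URL are made up for the
example.

```python
def log_command(branch_url, stop_at_branch_point=False):
    """Build an `svn log` argument list for a branch URL (hypothetical helper)."""
    cmd = ["svn", "log"]
    if stop_at_branch_point:
        # --stop-on-copy keeps the log from crossing the copy that created the branch.
        cmd.append("--stop-on-copy")
    cmd.append(branch_url)
    return cmd


# Assumed example URL, not taken from this thread.
BRANCH = "http://svn.example.com/repos/branches/naked-private-task"

full_history = log_command(BRANCH)  # follows history back past the branch point
branch_only = log_command(BRANCH, stop_at_branch_point=True)
```

Either list could be handed to subprocess.run() against a real
repository.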

> and while there may be revisions of a branch that correspond to
> change sets, the notion that multiple successive revisions on the
> branch may be part of a larger task that are "delivered to the
> trunk" as a whole, doesn't really transport across branches.


> The trunk knows that the stuff committed to it since the previous
> trunk revision is all one group. But the branch from which those
> changes came has no knowledge of that. The branch just knows about
> commits to itself, not bundles of changes-sets merged from/to some
> other branch.

Correct. Even more specifically, the trunk doesn't even know that a
bundle of change-sets was merged to it - it just gets a commit of
edits and doesn't care what created them. So the information that the
commit is a result of a merge operation pulling some change-sets from
a branch needs to be described in the log message, if that information
is to be retained.
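
A rough sketch of what "describe it in the log message" could look
like when scripted, in Python. The message wording is just one
possible convention, and both helper names are invented for the
illustration; "svn commit -m" itself is the ordinary command.

```python
def merge_log_message(start_rev, end_rev, source_path, target_path):
    """Describe a merge in words, since the commit itself does not record it."""
    return f"merged {start_rev}:{end_rev} from {source_path} to {target_path}"


def commit_command(message, working_copy="."):
    # The working copy path "." is an assumption for this example.
    return ["svn", "commit", "-m", message, working_copy]


message = merge_log_message(354, 364, "/branches/naked-private-task", "/trunk")
command = commit_command(message)
```

The point is only that the revision range and the branch path end up
in the message, because nothing else in the trunk commit carries them.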

> So I hear you telling me that the only way to tell a branch that it
> should "reset" its history-logger after it's been "resynced" with
> the codeline is to explicitly "rebranch" it so that it thinks its
> branch-off point is the new-tip of the trunk instead of a previous
> one.


> The other interesting thing is that the branch name apparently is
> not associated with the revision on the trunk that resulted from
> merging that branch to the trunk. So folks that created a branch for
> a specific task in their tracking system (e.g. bugzilla or Jira or
> scarab, etc.) who may have used the task-id in the branch-name,
> often expect that branch-name (containing the task-id) to somehow
> "live on" in the history of the trunk once the changes are merged to
> the trunk.

Again correct. The only way for the branch-name to live on is to
record it in the log message, so people can trace it later.

> It seems they don't get that right now, based on a recent request I
> saw about wanting the ability to create some kind of "alias name"
> for a revision number that is separate from the notion of a "tag"
> (which I can fully understand, tho I agree calling it a "tag" rather
> than a revision alias would cause confusion, or perhaps a "Name"
> property of a revision, if properties can be associated with a
> revision)

Yes, that's right.

As a sidetrack from this - to implement smart merging, one usually
needs to record what has already been merged, so it does not get
merged again. As a side product of this tracking, it obviously means
that branch-name (or something similar) must be recorded in the
history of trunk, along with the actual change-sets that were
merged. So in a way you were right earlier when you said that smart
merging would fix this - as a side effect, yes - it would also record
what has been merged where, so it doesn't have to be written in the
log messages.
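
To illustrate the bookkeeping that smart merging needs, here is a toy
Python model. It is not how Subversion works today or a proposal for
how it should; the class and method names are invented, and it only
shows that remembering (branch, revision) pairs is enough to skip
changes that were already merged.

```python
class MergeTracker:
    """Toy record of which revisions of which branch have been merged."""

    def __init__(self):
        # source branch path -> set of revision numbers already merged
        self._merged = {}

    def record(self, source, revisions):
        self._merged.setdefault(source, set()).update(revisions)

    def pending(self, source, candidates):
        """Return the candidate revisions that have not been merged yet."""
        done = self._merged.get(source, set())
        return [rev for rev in candidates if rev not in done]


tracker = MergeTracker()
# A merge of 354:364 carries the changes made in revisions 355 through 364.
tracker.record("/branches/naked-private-task", range(355, 365))
# A later merge request overlaps the earlier one; only the new part remains.
pending = tracker.pending("/branches/naked-private-task", range(360, 370))
```

Note that the branch path is part of the key, which is exactly the
"branch-name recorded in the history of trunk" side effect.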

>> Is there a reason why one should not delete a private branch after
>> a commit to the mainline, and recreate it when starting a new one?
> Why recreate the branch rather than create a new branch by a new
> name?

Well, keeping the name was mostly just to show that it is possible -
in practise, there isn't often a reason to keep the name.

> If you delete it and then recreate it, I would say that you are, in
> essence, still using the private branch pattern, where the
> rebranching is part of a tool-implementation specific tactic for
> "delivering" changes from the private branch to the trunk that also
> tells the "delivering" branch it is now "resynced" as far as history
> is concerned.

Yes, obviously it is the way things are often done in Subversion. To
be even more exact, the deletion is to tell the branch that it is
"delivered", and the recreation is what happens at the start of the
new task to tell that the changes are against trunk at that
point. There might be some time and some commits on trunk between
these two events.

But it might as well be said that creating a new branch at the start
of a task and deleting it at the end, merely keeping the same name
each time, is the native private branch pattern, and that re-using
the old branch is just a tool-implementation specific tactic to avoid
a cumbersome re-branching operation.
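
The two events can be sketched as two separate commands, here built
as argument lists in Python. "svn delete URL -m" and "svn copy URL
URL -m" are the real commands; the function names, URLs, revision
number and message texts are assumptions made up for the example.

```python
def finish_task(branch_url, trunk_rev):
    """Deleting the branch marks it as delivered."""
    message = f"Branch delivered to trunk in r{trunk_rev}, deleting."
    return ["svn", "delete", branch_url, "-m", message]


def start_task(trunk_url, branch_url):
    """Recreating the branch says: new changes are against trunk as of now."""
    message = "Recreate private branch from current trunk."
    return ["svn", "copy", trunk_url, branch_url, "-m", message]


# Hypothetical repository layout.
TRUNK = "http://svn.example.com/repos/trunk"
BRANCH = "http://svn.example.com/repos/branches/naked-private"

delete_cmd = finish_task(BRANCH, 365)
recreate_cmd = start_task(TRUNK, BRANCH)
```

Any amount of time, and any number of trunk commits, may pass between
running the first command and the second.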

> If I create a new branch for a new task instead of reusing (via
> deleting and then rebranching with the same name), then it is either
> because I still want the branch-tag to be associated with the
> delivered set of changes and their history, or because I still want
> the associated storage or sandbox. It seems I can't get the former
> (the delivered changeset => name association) without the latter.

Well, neither is exactly correct. In any case, the reference to the
delivered set of changes and their history should be recorded in the
log message on the commit to the trunk. For example, "merged 354:364
from /branches/naked-private-task to /trunk". This information will
never disappear or change, so it will forever be associated with
that commit. Keeping the branch around and not deleting it only means
that it is more easily located, since it is still alive at the HEAD
revision of the repository. But even that is not too useful, since the
branch does not know when it was merged to trunk - or even whether it
ever was - that information is only in the history of the trunk.
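
If the log messages follow a fixed wording like the example above,
tracing a trunk commit back to its branch can be automated. A minimal
Python sketch, assuming exactly that wording; the pattern and the
function name are my own invention, not anything Subversion provides.

```python
import re

# Matches notes of the form "merged 354:364 from /branches/x to /trunk".
MERGE_NOTE = re.compile(r"merged (\d+):(\d+) from (\S+) to (\S+)")


def parse_merge_note(log_message):
    """Return the merge details found in a log message, or None if absent."""
    match = MERGE_NOTE.search(log_message)
    if match is None:
        return None
    start, end, source, target = match.groups()
    return {"start": int(start), "end": int(end),
            "source": source, "target": target}


note = parse_merge_note(
    "merged 354:364 from /branches/naked-private-task to /trunk")
```

This only works as long as everybody writes the note the same way,
which is the weakness of keeping the information in free text.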

So, if you want to keep branch-tag as a mnemonic for the delivered set
of changes and their history, you *tag* the trunk revision at which
the changes were merged to the trunk. Then people can look at the
change-set produced at that revision to see the actual change - and
look in the log message of that revision to see where to find the
entire set of changes and their history as they were in the branch.
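
Tagging a specific trunk revision is just a copy with a revision
argument. A Python sketch that builds the command; "svn copy -r REV
SRC DST -m" is real, while the function name, URLs, tag name and the
revision number 365 are placeholders for the example.

```python
def tag_trunk_revision(trunk_url, tag_url, revision, note):
    """Build the copy that names the trunk revision a merge was committed in."""
    return ["svn", "copy", "-r", str(revision), trunk_url, tag_url, "-m", note]


# Hypothetical values: suppose the merge was committed to trunk as r365.
tag_cmd = tag_trunk_revision(
    "http://svn.example.com/repos/trunk",
    "http://svn.example.com/repos/tags/naked-private-task-delivered",
    365,
    "Tag trunk at the revision where naked-private-task was merged.",
)
```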

And as for the associated storage or sandbox - in Subversion it is
trivial to change the branch that a sandbox is associated with, while
keeping local edits (unfinished changes) in the sandbox. You can even
create a new branch from the changes (and versions) that exist right
now in your sandbox. So there is no need to take sandboxes into
account at all when deciding whether to re-use the same branch-tag or
to come up with a new one.
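
Both sandbox operations map to one command each. The Python sketch
below only builds the argument lists; "svn switch URL PATH" and "svn
copy WCPATH URL -m" are the real commands, and the helper names, URL
and message are assumptions for the illustration.

```python
def switch_sandbox(new_branch_url, sandbox="."):
    """Point an existing sandbox at another branch, keeping local edits."""
    return ["svn", "switch", new_branch_url, sandbox]


def branch_from_sandbox(new_branch_url, message, sandbox="."):
    """Create a branch from what is in the sandbox right now."""
    return ["svn", "copy", sandbox, new_branch_url, "-m", message]


# Hypothetical branch URL.
NEW_BRANCH = "http://svn.example.com/repos/branches/next-task"

switch_cmd = switch_sandbox(NEW_BRANCH)
branch_cmd = branch_from_sandbox(NEW_BRANCH, "Branch from my sandbox state.")
```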

>> I mean, sure, you can keep on editing in the same branch, you can
>> use revisions to mark ranges, you could build support into
>> Subversion to record the latest commit and only show diffs and logs
>> until that and smart merging will save your ass if you merge
>> changes twice - but why bother? Is there something to be gained by
>> all that?
> The reason for doing that instead of a new-branch per task would be
> if I could still get the benefits of task-based change-set grouping
> without having to create so many additional branch-names and
> associated "copies". If I could reuse the "copy", and let the "name"
> be associated with revision resulting from porting the changes to
> the trunk, then I'm still creating new names (that get associated
> with trunk revisions) but I'm not creating new branches/tags nor
> their associated copies, and the version tree has a much simpler
> structure, and I'm still doing a whole lot less copying (even tho it
> is a lightweight operation, it's still not as lightweight as not
> doing it at all)

Well, yes, by re-using the branches the version tree has a much
simpler structure in that not so much branching and tagging is going
on. But on the other hand the logical structure of the version tree is
more complex, since a single branch has several successive tasks, with
lulls in between possibly, instead of just having small branches with
short lifetimes that get merged back to trunk, having only one task
per branch. Also, you seem to be assuming that copying is a more
heavyweight operation than, say, changing a line in a file. Other
version control systems avoid copying only by making other changes,
such as recording branch-tags or merge history somewhere. So copying
is not an extra cost, because the alternative is never to do nothing
at all.

So it seems to me to be avoiding some cost that just isn't there.

> Plus there is a difference between being lightweight, and being
> "perceived" as lightweight. For some folks, no matter how well you
> explain to them that branching in SVN is inexpensive+lightweight, it
> still seems conceptually "heavyweight" to them. And even if the
> implementation of doing so is lightweight, it will still seem
> conceptually more complex (as will having to delete+rebranch,
> something they would regard as an indication of additional
> technical "residue" required by the additional conceptual
> complexity, for something they see as not being necessary in the
> first place).

Well yeah, a lot of people have emotional baggage left over from
previous version control systems that can be hard to deal with.

But for a person who doesn't have such baggage, it is conceptually
very straightforward to operate by: when you want to make a change to
the mainline, branch from the mainline - when you have committed
(merged) the changes to the mainline, delete the branch.

> But I think the answer to your question is that with SVN, what you
> described (deleting+rebranching) is really just part of the SVN
> implementation of using a "private branch" (assuming you are still
> using the same branch-name when you rebranch). I had thought you
> were doing something more like the "task branch" pattern, and
> wondered why not "reuse" the existing branch rather than create a
> new branch by a new name. It sounds like that's not what you're
> doing.

Actually, I joined this conversation half-way, to try and clear up the
facts on what Subversion can and cannot do right now - so there is
nothing in particular that I am doing myself. And this sidetrack here
was just because I was wondering *why* one should reuse the existing
branch, be it a private branch or a task branch.

> In SVN, because branches and tags are both "copies", it can be
> difficult to discuss branching and labeling concepts in ways that
> treat both as separate/separable things

Yeah, it doesn't translate as easily to other version control systems
(or concepts) as a more traditional model would.

> (and that is essentially one of the main differences between a
> private branch and a task-branch. One reuses the work-stream and the
> work-space, while the other creates a new work-stream and workspace
> for the purpose of keeping the branch-name around as a mnemonic for
> the revision corresponding to the delivered change-set)

Well, as discussed above, the situation isn't exactly the same with
Subversion. So the difference mostly boils down to just how you name
the branch and how you end up using it.

-- Naked

Received on Mon Mar 22 23:35:16 2004

This is an archived mail posted to the Subversion Users mailing list.