Re: branching several times a day (was Re: Sourcesafe user needs primer on branching source control)

From: Brad Appleton <brad_at_bradapp.net>
Date: 2004-03-12 20:08:37 CET

On Fri, Mar 12, 2004 at 11:18:12AM +0100, Stefan Haller wrote:
> Brad Appleton <brad@bradapp.net> wrote:
>
> > So if you primarily work on one task at a time, you have a
> > single branch all to yourself. When you are done with your
> > change (and after you have "updated" from the main trunk)
> > you "commit" your change to the main trunk.
>
> I'm not sure I really understand what you mean. Are you
> saying you would first merge from the trunk to your branch
> (the changes that other people have committed to the trunk
> in the meantime), and then merge back from your branch to
> the trunk?

Yes. This is a commonly recurring standard best practice in
most VC tool "communities" where the tool has decent branching
support (i.e., not VSS :). In CVS and SVN the "update" command
does this. In ClearCase/UCM it is called rebase (short for
"rebaseline"); I think Perforce uses "sync". BitKeeper calls it
"pull". Other tools call it "import" or "merge-in".

The idea is this: you are about to commit your changes to the
codeline. If other changes have been committed to the codeline
since you started your change, then your sandbox is not "up to
date" with the latest "good" state of the codeline. Hence if you
commit your changes, you will have potential inconsistencies and
even merge-conflicts to reconcile, and you may "break" the build
of the codeline. If you break the build, it impacts the whole
team because none of them can commit their changes now either.
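
To make the "out of date" check concrete, here is a minimal
sketch in Python. It is a toy model, not how CVS or SVN work
internally: the Codeline and Sandbox classes and the commit()
and update() helpers are names I made up for illustration, and
a "revision" is just a count of commits.

```python
from dataclasses import dataclass, field


@dataclass
class Codeline:
    """Toy model of a shared codeline: an ordered list of committed changes."""
    changes: list = field(default_factory=list)

    @property
    def head(self) -> int:
        # The head revision is simply the number of commits so far.
        return len(self.changes)


@dataclass
class Sandbox:
    """Toy model of a private workspace based on some codeline revision."""
    base: int = 0


def is_up_to_date(sandbox: Sandbox, codeline: Codeline) -> bool:
    # The sandbox is stale if anything was committed after its base revision.
    return sandbox.base == codeline.head


def update(sandbox: Sandbox, codeline: Codeline) -> list:
    # Return the changes the sandbox has not seen yet and move its base forward.
    incoming = codeline.changes[sandbox.base:]
    sandbox.base = codeline.head
    return incoming


def commit(sandbox: Sandbox, codeline: Codeline, change: str) -> None:
    # Refuse to commit from a stale sandbox: merge and retest privately first.
    if not is_up_to_date(sandbox, codeline):
        raise RuntimeError("sandbox is out of date; update before committing")
    codeline.changes.append(change)
    sandbox.base = codeline.head
```

In this model a commit from a stale sandbox fails in the
sandbox, where only its owner is affected, instead of landing
on the codeline and breaking it for everyone.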

So the prevailing wisdom that has emerged says: find a way to
test the result of my changes + the codeline such that if the
result fails, it only impacts me and my sandbox and not the rest
of the team. There are two ways to do this:

A) Don't use "Latest-and-Greatest"!
----------------------------------
Instead, only use the most recently "blessed" (e.g. promoted)
baseline (label or tag). This offloads a lot of merge and
build work and resultant labeling+promoting to a buildmeister
and/or build-blesser. The downside is that having changes that
are not "in sync" with the latest stuff becomes increasingly
common, and it takes increasingly longer for builds to be
blessed and for the codeline and sandboxes to be "in sync".

The upside is that it is easier to isolate the set of
changes that you had to make, because you don't have to
checkout/merge/add any files/lines for changes that you had to
merge-in from elsewhere. If your VC tool has decent support
for being able to figure out which changes were REALLY made
by you and which ones were simply carried-forward by you,
this is less of a traceability concern.

OTOH, it might be easier to "reuse" the un-synced changes
in your workspace to "propagate forward" into a subsequent
parallel supported release. (Then again, it might not be any
easier, and could even be harder).
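
As a rough sketch of what "use the most recently blessed
baseline" means, the Python below picks the newest promoted
label from a list. The Baseline record, its fields, and the
label names in the usage note are assumptions made up for the
example; real tools store promotion state in their own ways.

```python
from typing import NamedTuple, Optional


class Baseline(NamedTuple):
    label: str      # hypothetical tag name
    revision: int   # codeline revision the label points at
    blessed: bool   # True once the buildmeister has promoted it


def latest_blessed(baselines: list) -> Optional[Baseline]:
    # Ignore anything not yet promoted, then take the newest by revision.
    blessed = [b for b in baselines if b.blessed]
    return max(blessed, key=lambda b: b.revision, default=None)
```

Given labels BUILD_10 and BUILD_11 (both blessed) and a newer
BUILD_12 that is still awaiting promotion, a sandbox would be
based on BUILD_11 and would not see BUILD_12 until it is
blessed.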

B) Update your Sandbox to Keep Current
--------------------------------------
Use latest-and-greatest. Do an update as often as desired
when there are new commits to the codeline. Keep your sandbox
(and branch) in sync with the latest state of the codeline so
that you don't have a "big bang" merge at the end of your task
and have to reconcile a maximal number of changes and your own
rework efforts. Instead do regular, frequent, and incremental
integration into your own sandbox so you only merge small and
easy chunks at a time, and decrease the amount of time and the
likelihood of occurrence that the codeline may be broken and that
you will have to do major rework before committing your changes.

The upside is that frequent incremental integration helps keep
everyone current and reduces the size and complexity of merge
conflicts and eases their reconciliation. It also minimizes
the window of time between when you are ready to commit your
changes and when you have finished committing them and have
verified the result is still consistent/correct.

The downside is your branch contains lots of changes that were
carried forward by you but not necessarily made by you. Again,
this is more of a traceability concern. Some would say it also
makes it harder to "subtract" the added functionality from the
codeline if desired at a later date - and this is true to some
extent. At the same time, following this practice decreases the
likelihood that it will be necessary as well as the likelihood
that a change will "break the build" (whereas if you haven't
done it, and you hear about this, you worry about how to undo a
broken build because you are more used to it happening, since
you don't sync as frequently - a bit of a catch-22).
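
The "small chunks versus big bang" trade-off can be shown with
a little arithmetic. The Python sketch below uses the number
of incoming commits as a crude stand-in for merge effort (an
assumption of mine; real effort depends on what the commits
touch), and merge_sizes() is a hypothetical helper.

```python
def merge_sizes(commits_per_day: list, update_every: int) -> list:
    """Return how many incoming commits each update has to absorb."""
    sizes = []
    for start in range(0, len(commits_per_day), update_every):
        # Everything committed since the previous update arrives at once.
        sizes.append(sum(commits_per_day[start:start + update_every]))
    return sizes
```

With commits_per_day = [3, 1, 4, 1, 5, 9], updating every day
gives six merges of at most 9 commits each, while updating
once at the end of the six days gives a single merge of all
23. The total is the same; only the size of each
reconciliation changes.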

So which is best?
=================
In general, most small and medium projects prefer the Frequent
Incremental Update approach - what I call "Continuous Update"
in my article "Codeline Merging and Locking: Continuous Update,
Two-Phased Commits" in Nov'03 CMCrossroads news at:
<http://www.cmcrossroads.com/newsletter/articles/agilenov03.pdf>

Larger projects, particularly those that have dedicated
build-meisters that typically don't let developers commit their
own changes, tend to eschew the "Latest-and-Greatest" and insist
on using static, formally identified/blessed labels. It is
more careful and controlled but also adds a lot of development
"friction" and wait-time, with the benefit of reducing the cost
of rework by preventing the "big merges" (rather than amortizing
them over small frequent chunks :-).

In the end, both are different risk-management approaches that
have their own appeal to their own audiences. There is "pay
now" (the static baseline), and there is "pay later" (don't
use anything and wait till it burns you), and there is "pay
as you go" (the frequent and disciplined use of incremental
integration, even during one's own change-task).

However, I have noticed in the last 5 years that more and more
shops are leaning toward developer "push"-style integration
(allowing developers to merge/commit their own changes), and
requiring them to rebase before committing. To mitigate the
risk, they use what I call a "docking line" and the developers
push ("dock") their changes to this "active" development line,
and then the SCM/Build folks can preview/audit the stability
of what is there before deciding to "pull" the "docked" changes
from the active development line over to the mainline or
release-line branch.
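
Here is a minimal Python sketch of the docking-line flow just
described, assuming the simplest possible policy: everything
docked is promoted together or not at all. The DockingLine
class, its method names, and the is_stable callback are
illustrative assumptions, not taken from any tool.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DockingLine:
    """Toy model: an active development line feeding a mainline."""
    active: list = field(default_factory=list)    # changes docked by developers
    mainline: list = field(default_factory=list)  # changes pulled by SCM/build staff

    def dock(self, change: str) -> None:
        # Developer "push": the change lands on the active line only.
        self.active.append(change)

    def pull(self, is_stable: Callable[[list], bool]) -> bool:
        # SCM/build "pull": audit what is docked, then promote all of it or none.
        if not self.active or not is_stable(self.active):
            return False
        self.mainline.extend(self.active)
        self.active.clear()
        return True
```

The point of the split is that a bad change can reach the
active line without ever reaching the mainline: pull() returns
False and leaves the mainline untouched until the audit passes.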

I personally find that in my experience, the more frequent and
more incremental approach gives better overall stability and
suitability PROVIDED that developers are disciplined about
making sure their stuff works and won't-break-the-build before
merging it and learn how to successfully merge, and generally
do a good job of using encapsulation and modularity in their
coding. It also means "code ownership" (e.g. of a module/class)
can not be "exclusive" but is more like "stewardship" than
ownership (exclusive code ownership makes it difficult to do
this, and forces a more sequential-locking approach, and more
"wait-time" for the code-owner to make the changes you would
otherwise get their help on when reconciling merge conflicts).

Good design, discipline and collaboration keep codelines
consistent, correct and coherent, and make LATEST-AND-GREATEST
with continuous/hyperfrequent integration+updates be very
effective and HIGHLY productive. If you don't have all three
of those things and continuous (the encapsulation/modularity,
the discipline to test what you have to ensure you don't break
the build, the ability to collaborate well to resolve merging
concurrent changes) then you break something for either
the SCM/Buildmeisters, or the QA/V&V, or the code-owners,
and ultimately for project management. In those cases the
formal static baselining and throw-it-over-the-wall "pull
model" of integration is more rampant, and takes more time,
but gives more reliable quality results (and results in more
adversarial relationships between those competing roles).

For more info on the "Docking Line" pattern, you can see the
two sets of powerpoint slides from previous RUC conference
presentations I've given at:
http://acme.bradapp.net/#ClearCase

For more info on "Active Development Line", "Release Line"
and "Mainline" patterns you can see the "SCM Patterns" book
(www.scmpatterns.com) and also see precursor descriptions
of them in a rather comprehensive (and lengthy :) branching
best practices paper at:
http://acme.bradapp.net/branching/

For more info in particular on "Continuous Update" and
several companion practices that accompany it, see the
aforementioned paper on codeline merging and locking at:
http://www.cmcrossroads.com/newsletter/articles/agilenov03.pdf

It talks about the following dozen or so locking-related
practices and the circumstances (context) in which each is
appropriate to use. Alternatives range from no locking and
a single integration machine, to an integration token, to
various forms of codeline locking.

Continuous Workspace Update
* Workspace Update
   + Post-Commit Notification
* Private Checkpoint/Versions
   + Private Archive
   + Private Branch
   + Task Branch
   + Checkpoint Label

Two-phased "Commit"
(where the commit "transaction" is viewed as having two phases:
  a commit-phase, and a "preparation" phase that consists of:
  rebase+reconcile, rebuild+retest, resolve)
* Pre-Commit Validation
* Codeline Locking (and factors of team-size, build/test-time,
   parallel tasks, likelihood of collisions/conflicts,
   commit-duration and overlap)
   + Single Release Point (e.g., single integration machine)
   + Integration Token
   + Codeline Write-Lock
      - Full Codeline Lock
      - Partial Codeline Lock
      - Double-Checked Codeline Lock
      - Phased Codeline Lock
   It discusses appropriate context for the locking patterns
   based on the above mentioned factors.
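
As a sketch of how the two-phased commit and an integration
token might fit together, the Python below serializes
integrations with a single lock and only commits if the
preparation phase succeeds. This is my own simplification:
the three callbacks are hypothetical hooks, and holding the
token for the whole preparation phase is just one of the
locking choices listed above.

```python
import threading
from typing import Callable

# Hypothetical single token: whoever holds it is the only one integrating.
integration_token = threading.Lock()


def two_phase_commit(
    rebase: Callable[[], bool],
    rebuild_and_retest: Callable[[], bool],
    commit: Callable[[], None],
) -> bool:
    """Run the preparation phase, then commit only if it succeeded."""
    with integration_token:
        # Phase 1 (preparation): rebase+reconcile, then rebuild+retest.
        if not rebase():
            return False  # conflicts are left to resolve in the sandbox
        if not rebuild_and_retest():
            return False  # build or tests failed; nothing reached the codeline
        # Phase 2: the commit itself, from a sandbox that is current and passing.
        commit()
        return True
```

If either preparation step fails, the function returns False
and the token is released without anything having been
committed, so the failure stays in the developer's sandbox.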

All of those locking-related patterns are successfully recurring
solutions in common practice. But the context is important. Use
a pattern in the wrong context, and at best you might simply
be doing more than you really need to (at worst you could really
foul things up).

Hope that helps!!!

-- 
Brad Appleton <brad@bradapp.net> www.bradapp.net
  Software CM Patterns (www.scmpatterns.com)
   Effective Teamwork, Practical Integration
"And miles to go before I sleep." -- Robert Frost

Received on Fri Mar 12 20:09:09 2004
