On Saturday, May 18, 2002, at 10:28 AM, Mark C. Chu-Carroll wrote:
> Actually, there's a whole lot to gain by integrating build support into
> the SCM system.
Well, that depends on what your goals are. The reason CVS and Subversion
are so useful is that they know as little as possible about the contents
of the files and directories they are versioning. They make a distinction
between text and binary files and do simple line-ending conversions.
That's it. As a result, you can use them to version just about anything.
(Side note to Karl: I think this is a good example of the
worse-is-better philosophy. I too am wary of it, because it has a short
half-life and quickly degrades to easier-is-good-enough if you're not
careful. But every now and then you run into a case like this!)
If I understand correctly, Stellation takes the opposite approach, and
parses out the contents of the source files in order to do more
intelligent diffing and patching. That does afford some advantages, but
it also limits flexibility, as you can't work with datatypes the system
doesn't understand.
So I can see how integrating build support into Stellation could be a
win. You're already committed to understanding the semantics of the
source tree. But with Subversion (and CVS) you lose a lot.
> The main factor is that by building things into the system, you
> can do automated work that would be intractable for a human.
> For example... Andreas Zeller did some work with changeset based
> systems on identifying problems. The basic idea is that a bunch
> of programmers all checked in changes. Then, during the nightly
> build, you discover that the system no longer compiles correctly,
> or that it no longer passes the standard tests.
> But there's 30 changesets. That means 30 tests to determine if
> one of those changesets is the one that broke the build. But what
> if what breaks the build is a *combination* of the changes in more than
> one changeset?
> Zeller's system did a binary-search like process to try to determine the
> minimal group of changesets that cause the breakage.
> A system like that could be implemented outside of the system; but
> it's a
> heck of a lot nicer to tie enough of it into the system that it can
> run fully automatically.
I have two somewhat contradictory responses to this.
On the one hand, I'd say that this system is not significantly better
than the one used by the Subversion team. Instead of concentrating the
task of detecting problems at one point, that responsibility is
distributed among the developers, each of whom is responsible for
ensuring that his changes don't break anything. Since commits are
atomic, each developer doesn't have to worry about a combinatorial
explosion of changeset permutations. He just has to make sure that his
changes, when applied to the current state of the tree, don't break
anything.
At the same time, build breakage is probably the easiest to detect and
easiest to fix of all the possible problems a changeset could introduce.
To really detect problems you need to test the behaviour of the software
once it's built.
The Subversion team does this with a suite of automated tests developed
alongside Subversion itself. Each developer ensures that his
changes not only don't break the build, but also don't break any of the
tests, again avoiding the need to test changeset permutations. This type
of automated testing can't be built into the version control system
because it's too domain-specific. So much so that it's part of the
project being versioned.
On the other hand, where automated testing along the lines Zeller
proposes *is* useful, it's quite possible to build it on top of a
version control system that knows nothing about the build process. The
svn-breakage mailing list is a good example. Various machines with
different CPU architectures and operating systems do automated
checkouts, builds and tests of each changeset and mail the results to
svn-breakage. It's simple, effective and flexible.
Zeller's system or the user-work/user-commit system you propose could be
implemented on top of Subversion as easily as within it.
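For what it's worth, the search itself is a small amount of code. Here's a
rough sketch in the spirit of Zeller's minimization (his "delta debugging"
idea) over a list of changesets. The `fails` predicate stands in for whatever
checkout-build-test step the surrounding scripts provide; none of these names
come from Zeller's actual tool:

```python
def ddmin(changes, fails):
    """Shrink `changes` to a small sublist for which fails() is still True.

    `changes` is a list of changeset ids; `fails(subset)` is a
    caller-supplied predicate that checks out the subset, builds, and runs
    the tests, returning True when the build or the tests break.
    """
    assert fails(changes), "the full set of changes must break the build"
    n = 2  # current number of chunks to split into
    while len(changes) >= 2:
        size = len(changes) // n
        chunks = [changes[i:i + size] for i in range(0, len(changes), size)]
        reduced = False
        for i, chunk in enumerate(chunks):
            complement = [c for j, ch in enumerate(chunks) if j != i
                          for c in ch]
            if fails(chunk):  # one chunk alone breaks the build
                changes, n, reduced = chunk, 2, True
                break
            if len(chunks) > 2 and fails(complement):
                changes, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(changes):  # already at single-changeset granularity
                break
            n = min(len(changes), 2 * n)  # split finer and retry
    return changes
```

With `fails` implemented as a script that does a checkout at the candidate
changesets, builds, and runs the tests, this sits entirely outside the
version control system, which is exactly the point.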
> Another major issue that you can address with integration of build with
> repository is product cacheing. If you're working with a really big
> system, builds take a *long* time. And lots of programmers are doing
> builds to test their own stuff. All of the programmers are spending a lot
> of time idle waiting for build results. (At one point, I was involved
> with the VisualAge C++ project. Builds of the system could take two or
> three hours. We'd spend half of our work day waiting for builds to
> finish.)
> But almost all of that build time is redoing exactly the same work on
> many different systems. We had about 7 people in our building, working
> on a part of a 4 million line system. Each person would be changing one or
> two source files, and then waiting for the build result of those changes
> compared with the nightly builds.
> If the SCM system understands the build process, it can store the
> intermediate results, and before starting a build step, check if anyone
> else has either done that already, or is in the process of doing it. At
> the least, you avoid a lot of redundant builds; at the best, you get
> build parallelization.
Again, this is properly part of the build system rather than the SCM
system. It doesn't require the integration of the two. Heck, the easiest
way to achieve this would be to check in the object files along with the
source. As long as all the developers are working on the same platform,
you could do it with CVS or Subversion today!
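A shared product cache along those lines is also easy to sketch outside the
SCM. The idea, roughly what tools like ccache do for C, is to key each
compilation on a hash of the source text plus the compiler flags, so
identical work is done only once. Everything below is illustrative, not any
real tool's interface:

```python
import hashlib


class BuildCache:
    """Cache compilation results keyed by source text plus compiler flags."""

    def __init__(self):
        self.objects = {}  # content hash -> compiled object
        self.hits = 0

    def compile(self, source, flags, compile_fn):
        """Return the object for (source, flags), compiling only on a miss.

        compile_fn stands in for the real compiler invocation.
        """
        key = hashlib.sha1((flags + "\0" + source).encode()).hexdigest()
        if key in self.objects:
            self.hits += 1  # someone already built this: reuse their work
        else:
            self.objects[key] = compile_fn(source, flags)
        return self.objects[key]
```

Shared between developers, say over a network filesystem, a cache like this
gives the redundant-build savings described above without the SCM knowing
anything about compilation.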
Received on Mon May 20 00:01:24 2002