
collaborating

From: Tom Lord <lord_at_regexps.com>
Date: 2002-10-10 22:06:05 CEST

> We decided we could use your help a long time ago.

Interesting.

> Have you decided to [help]?

Let's find out.

It might be worth some time to work first on meta-issues such as:

* scope
* participants
* forum
* process

in order to avoid wasting one another's time by going further into this
with wildly different expectations. While the default rule is usually
"code talks; chat walks", the issues I can best help with are not best
addressed by my submitting a series of patches since any code changes
in these areas ought to be preceded by some broadly-scoped design
work; a directionless email exchange is likely to be impractical and
inefficient, if not downright ineffective.

Perhaps some of the discussion of these meta-issues would be better
conducted off-list.

Let me speak briefly to "scope", relating it to current events in the
industry, because that may start to simplify questions about
participants (initial and eventual), forum, and process.

Lately there has been a little brouhaha about the BitKeeper license
and its potential impact on linux kernel developers. Let's step back
and look at that more broadly.

At the root of the hubbub is the issue of _engineering processes:_
their impact on economics, and their impact on freedom.

The linux kernel, nowadays, has considerable economic importance to an
impressive list of companies. Given the size of the project (code
size and participant group size) and the distribution of labor (wildly
cross-corporate and involving independent volunteers), the engineering
processes which are used to organize kernel development have inherited
considerable economic importance of their own. Those processes
directly impact the bottom-line costs of every GNU/Linux vendor, the
speed and accuracy with which they can compete against other systems
and support their customers, the degree to which they can differentiate
among themselves, and the degree to which those vendors can truly reap
the practical benefits of sharing source code with one another.
Judging by the anecdotal reports, the switch of a number of the most
active core maintainers to BitKeeper has been a major improvement from
the perspective of all of the vendors using the linux kernel.

The kernel is not unique with regard to these engineering process
issues. Other projects (notably GCC) have, over the past five years
or so, made significant process improvements of their own. The Apache
and Mozilla projects were organized with process considerations in
mind from the start.

Across all of these projects, the processes are far from perfect -- in
no small part because of limitations of the software tools being used
-- but the large _value_ of effective processes (and, hence, good
tools) is clearly recognizable.

The difficulties created by the recent BitKeeper license change run
deep. It was already problematic that developers who highly value
their freedom could not agree to use BitKeeper -- but now the problems
are far more tangible. The new license explicitly seeks to fragment
the free software developer community and to interfere with its
progress on new software. It has already accomplished that goal. That
effect is the _stated intention_ of BitMover. BitMover established
leverage in the free software world by buddying up to some kernel
maintainers, and is now using that leverage to impede progress in
other free software projects. Is anyone, aside (arguably) from
BitMover, benefiting from that trade-off?

So, while svn may have started off with the simple and self-contained
goal of making a modern CVS replacement, I think the scope of the
project ought to be reconsidered. The entire commercially significant
free software world has a pressing need for software tools that can
better define, organize, and support its engineering processes.
Effective revision control is a keystone element of such a tool-set.

From my perspective, the larger scope has some obvious implications
for the feature set of a revision control system. Beyond that, the
scope implies that we ought to think about process tools other than
just revision control, and how all of the pieces can be made to fit
together manageably and well -- we need a better assembly line, not
(merely) a better conveyor belt.

The svn project has been developing for quite a while now without
reaching 1.0. More recently, `arch' popped up and, very quickly, with
very little code, presented a system that is clearly at least
competitive in terms of features and tractability. The two systems
have different goals and accept different constraints. They have
different strengths and weaknesses. Both, however, are candidates
for addressing "the BitKeeper problem".

I propose an intense and focused goal and design review, and goal and
design restatement, encompassing both of our projects, and looking
beyond them to the other tools in a software engineering tool-chain. I
think that now is a good time for us all to step back, take stock of
what we have and what we need, and to make a solid plan for execution
moving forward.

Such a scope is not purely technical. The planning activity itself,
and the work that may result from it, have considerable economic
value. If a review process can be initiated, it ought to run in
parallel with business planning and business activity. These are the
primary issues I wish to take off-list.

Stop-and-collect design reviews and plan reformations have a long and
hardy tradition in industry. They can be quite fun, having the form
of a brainstorming session that gives way to detailed and inspiring
planning, then execution. I think the time is ripe for such activity
in the free software revision control and process engineering tool
projects. It is in _that_ area that I now invite you to solicit my
help.

Enclosed is a little position paper I've been circulating. It, too,
is far from perfect -- but it may help to map out what I have in mind
when I talk about the broader context of process engineering tools in
general.

-t

                     The Process is the Solution:
             Envisioning a Better Free Software Industry

                Copyright (C) Thomas Lord, Fremont CA
                          v1.4, 10-Oct-2002

                   lord@emf.net, lord@regexps.com,
                      510-825-7915, 510-657-4988

        You may freely redistribute verbatim copies of this
        document, but changing it is not permitted. Please
        send comments and suggested revisions to the author.
        If you would like to distribute a modified version,
        please contact the author to ask for permission.

                 INTRODUCTION: THE "AHA!" EXPERIENCE

  My objective with this document is to produce an "Aha!" experience
  in you: a shared understanding of some software engineering issues
  in the free software world, and the business issues that relate to
  them. In this document, I lay out a practical program for reforming
  the engineering practices in the free software world, explain
  informally why that program makes sense, and point to the business
  opportunities this activity can create.
  
  Rationalizing free software engineering is wise: it alleviates some
  serious RISKS that are accumulating in the free software world, and it
  can lead to products of far greater quality. It has the chance to
  be lucrative: the engineering practices I advocate directly attack
  the now famous "IT ROI" problem in a focused way -- they suggest a
  new approach to serving customers effectively by moving engineering
  attention and effort closer to their individualized needs.

  Here's the form of this document: First, I will lay out six
  technical goals, as bullet points. Second, I will explain those
  goals in more detail: elaborating on what they mean, and why they
  are good goals. Each subsection in the second part contains a
  definition for the bullet point, and a rationale. Third, I will
  give my recommendations for next steps.

  By the end, if you have the "Aha!" experience, you should have in
  mind the beginnings of a picture of a reformed Free Software/"Open
  Source"/unix industry. You'll be able to start thinking about how to
  make it so. You'll be able to begin to articulate why it is such a
  good idea. If you're anything like me, you'll find it at least a
  little bit exciting, invigorating, and inspiring.

  

                      SIX GOALS FOR OUR INDUSTRY

1) Build a public testing infrastructure.

2) Build a public release engineering infrastructure.

3) Make standardized, efficient, cooperative forking the default
   behavior for all free software providers.

4) Design large customer installations as locally deployed development
   sites.

5) Organize businesses and business units that join moderately sized,
   regionally identified sets of individual consumer and small
   business customers into tractable markets for support, treating the
   support service providers in those markets as "large customer
   installations".

6) Simplify GNU/Linux distributions; reduce the core code base size;
   reduce the out of control package dependencies; focus on essential
   functionality, quality, and tractable extensibility.

                    ELABORATIONS ON THE SIX GOALS

1) Build a public testing infrastructure.

   A public testing infrastructure consists of:

        1) Software standards for configure/build/test tools

        2) Internet protocol standards for scheduling tests,
           delivering source code to test servers, and retrieving
           results.

        3) Public servers, implementing those protocols.

        4) Implementation of these standards for critical projects.
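
   For concreteness, here is a minimal sketch of the kind of message
   the scheduling protocol in points (2) and (3) might carry. Every
   field name, the JSON wire format, and the `pubtest/0.1' tag are
   assumptions invented for this example; none of it is an existing
   standard:

```python
import json

def make_test_request(package, revision, platforms):
    """Build a hypothetical test-scheduling message: which package
    revision a test server should fetch, and on which platforms it
    should run that package's test suite."""
    return {
        "package": package,        # globally unique package name
        "revision": revision,      # well-known revision identifier
        "platforms": platforms,    # platforms requested for the run
        "protocol": "pubtest/0.1", # invented protocol version tag
    }

def encode(request):
    # JSON is used as the wire format purely for illustration.
    return json.dumps(request, sort_keys=True)

def decode(wire):
    return json.loads(wire)

# A developer schedules a test of one well-known revision on two platforms.
req = make_test_request("hello", "hello--release--2.0--patch-3",
                        ["i386-linux", "sparc-solaris"])
assert decode(encode(req)) == req
```

   The point of standardizing even this much is that any public server
   implementing the protocol can serve any developer's request.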

   RATIONALE

   Developers and vendors alike need to be able to identify
   configurations of their packages for regular automated testing
   on a variety of platforms. Developers especially: the cost of
   quality rises the farther downstream you try to implement it.

   Few if any independent developers can afford to build their own
   non-trivial test infrastructure; at the same time, the cost of
   such an infrastructure large enough to serve many developers is
   quite affordable to large companies.

   This is a business opportunity: to build and maintain that
   infrastructure. It makes sense for the big vendors to pay for this
   infrastructure, and for neutral, smaller vendors to build and
   maintain it.

   A good test infrastructure ought to be carefully designed. There's
   the opportunity here for a critical social hack: Let's suppose you
   agree with the overall thesis of this document: that, generally,
   process and infrastructure reform is desirable in the free software
   world. One practical implication is that many individual projects
   will need to reform their makefiles, beef up their test suites, and
   so forth. That creates a social challenge: getting the
   anarchic world of developers to buy into that reform. That reform
   can simplify and lower the costs of a testing infrastructure. At
   the same time, access to the testing infrastructure can be very
   valuable to developers everywhere. So -- standardizing the source
   management infrastructure of one's project can be a condition of
   entry to gain access to the testing infrastructure; it can be a
   little Maxwell's demon for bringing source code discipline to the
   existing anarchy.

2) Build a public release engineering infrastructure.

   A public release engineering infrastructure consists of:

        1) Software standards for configure/build/test/install tools.

        2) Software standards for source code layout.

        3) Software standards for automating package dependency
           management.

        4) Software standards for construction and installation
           logging and auditing.

        5) Implementation of these standards for critical projects.

   RATIONALE

   Anecdotally: one of the best things that happened to BSD (back in
   the 80s) was a regularization of the whole system's build/install
   tools. Essentially, systematically regularizing all of the
   Makefiles gave small programming teams much greater leverage -- it
   made the source code system far more tractable by eliminating
   hundreds of tiny annoyances and inefficiencies. One could, for the
   first time, change a core library and rebuild the world over a lazy
   afternoon.

   Nowadays there's a RISKS benefit, too. Suppose an isolated customer
   site needs to make some global change to their software system
   (say, an emergency repair to a core library; the replacement of
   one of the core shell tools or daemons). Right now, if they run
   GNU/Linux, they can not do it cheaply, if at all. If they can do it,
   they will almost certainly wind up with a system that today's
   package managers (e.g. RPM) can no longer cope with.

   There's another RISKS benefit, too: eliminating single points of
   failure. Having all the world's binaries originate in only 1-5
   shops is just plain bad practice. With a few thousand shops,
   malicious producers have lesser impact; all shops have the
   opportunity to maintain their own trusted bootstrapping paths;
   binary releases can differentiate sufficiently to help reduce the
   potential impact of worms and other attacks.

   This also helps with a public testing infrastructure. A
   standardized configure/build/test/install infrastructure makes the
   construction of an automated testing infrastructure far more
   tractable.

3) Make standardized, efficient, cooperative forking the default
   behavior for all free software providers.

   "Standardized, efficient, cooperative forking" means:

        1) There is no authoritative central site for critical
           components. Instead:

        2) There is a global namespace for well-known base revisions,
           and:

        3) There is a global namespace for well-known change sets,
           and:

        4) Any source tree can be fully identified as the combination
           of well-known base revisions plus well-known change sets,
           and:

        5) Bugs and other issues are cataloged in a global namespace,
           and:

        6) The globally named entities described above are stored,
           redundantly and with verification, in a substantial number
           of widely distributed databases, and:

        7) The formatting of change sets, including associated
           "meta-data" (such as descriptive logging, and logging in
           relationship to well-known bugs) is standardized in ways
           that facilitate the development of advanced patching and
           merging tools, and:

        8) The principal unit of source code exchange among all of the
           developers and development teams is not tar bundles of
           source trees, but rather, change sets, and:

        9) Developers encourage the production of change sets which
           are for a single purpose, well logged, and well documented,
           and:

        10) Software and hardware resources are deployed to enable
            formal "signing off" on change sets, after independent
            review, by multiple parties.
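
   To make point (4) concrete, here is a sketch of how any tree
   reduces to a base revision plus change sets. The global names used
   below are invented for illustration; they follow no existing
   registry or naming convention:

```python
def tree_identity(base_revision, change_sets):
    """Identify a source tree (point 4) as a well-known base revision
    plus an ordered list of well-known change sets."""
    return {"base": base_revision, "changes": list(change_sets)}

def same_configuration(a, b):
    # Two trees describe the same configuration exactly when they name
    # the same base and apply the same change sets in the same order.
    return a == b

# A vendor's tree: an upstream base, one upstream fix, and one
# vendor-local change, each named in a (hypothetical) global namespace.
vendor_tree = tree_identity(
    "hello--stable--2.0",
    ["alice/fix-null-deref-1", "vendor/branding-1"])
assert vendor_tree["changes"][-1] == "vendor/branding-1"
```

   Once any tree can be named this way, two vendors can compare their
   configurations without exchanging a single tar bundle.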

   RATIONALE

   The 10 technical points outlined above are the essence of how to
   decentralize project maintainership -- to do away with "superstar"
   maintainers and replace them with a cooperative, anti-authoritative
   effort.

   By way of an example: IBM, HP/C, and Sun now each have multiple
   teams doing GNU/Linux development. We ought to make those teams
   far more efficient, far more precise, and thus far less risky.

   "efficient" and "precise" -- what do those mean in this context?

3a) Making vendor free software teams efficient

   Solitary project maintainers (whether individuals or committees)
   are a political, practical, and economic bottleneck -- single
   points of failure in a number of ways. They are vulnerable to
   burn-out, egotism, incompatible agendas, irrational agendas,
   bribery by malicious parties, infiltration by malicious parties,
   overload, incentive to extort, and loss of critical project state
   in the event of ceasing to be the project maintainer. While they
   remain the filters through which progress is made, they impede
   efficient and reliable progress. Projects, teams at the big
   vendors, and users should not be beholden to "superstar" project
   maintainers.

   The points above help to replace solitary superstar maintainers (or
   maintainer committees) with a decentralized system of _locally
   authoritative_ "gatekeepers".

   A gatekeeper is a maintainer who is responsible for the contents of
   _one instance_ of a project; one who decides based on local
   criteria which changes to incorporate, and at what rate.

   In a decentralized world, each system vendor would have their own
   gatekeeper for each project.

   Large and sophisticated customers also ought to have their own
   (possibly out-sourced) gatekeepers as well. It's ok if those
   customer gatekeepers are mostly passive (just accepting the latest
   from their favorite vendor). What's critical is that (a) they are
   prepared for detached operation should it become necessary; (b)
   they have a tractable option for site-specific customizations
   supplied by multiple vendors. See (4), below.

3b) Making vendor free software teams precise

   The current paradigm of free software development, particularly in the
   applications space, has been called "The Magic Cauldron" -- but
   could be called (to put the problem in extreme terms) "The
   Unaccountable Magic Cauldron".

   The Unaccountable Magic Cauldron model happens when, every N months,
   a project on the net makes a new tar bundle and various vendors
   decide if it is "stable enough" to make it into the distributions.
   That model is just a big "kick here" sign for vendors: an
   invitation to malicious code; an invitation to project disruption;
   even an extortion and bribe-taking opportunity for independent
   project leaders. It's liability-inducing (since it creates such
   obvious risks). And it's just Bad Practice: not all that different
   from the famous 70's and early 80's in-house practices in which
   "Gurus with low badge numbers" were able to push their employers
   around (and sometimes get them into big trouble) by hoarding
   control over projects that nobody else really understood.

   The opposite of the Unaccountable Magic Cauldron is, um, let's call
   it Change-based Source Management. Established projects should
   publish, sure, a definitive "head" version -- but more importantly:
   a set of clean, well-documented changes that vendors can
   independently and pro-actively review and test, change by change.
   Rather than believing in the myth of "many eyeballs", magically
   vetting source code for free, the "many eyeballs" phenomenon should
   be instituted as a formal practice.

4) Design large customer installations as locally deployed development
   sites.

   This goal represents a radical shift in "what customers buy".

   A "locally deployed development site" consists of:

        1) A repository for all source code used at the site.

        2) A repository for all source code change sets deployed
           at the site.

        3) Software tools and hardware resources sufficient to
           rapidly construct a complete development environment: one
           in which all of the source code is available for
           modification and redeployment.

        4) Complete and accurate logging of all software deployed at
           the site, and modifications made locally at the site.

        5) Software and hardware resources sufficient for testing
           fresh builds of the software deployed at the site,
           including tests designed to measure conformance
           to site-specific invariant requirements.
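
   A sketch of the logging-and-audit component, item (4) above. The
   record fields and the digest-chaining trick are assumptions made
   for illustration, not a proposed format:

```python
import hashlib
import json

def log_deployment(log, package, revision, local_changes):
    """Append one record to the site's deployment log: what was
    deployed, from which well-known revision, and with which local
    change sets applied."""
    entry = {
        "package": package,
        "revision": revision,
        "local_changes": local_changes,
    }
    # Chaining a digest over the previous record makes the audit trail
    # tamper-evident; the chaining scheme is an assumption of this
    # sketch, not an existing standard.
    prev = log[-1]["digest"] if log else ""
    body = json.dumps(entry, sort_keys=True)
    entry["digest"] = hashlib.sha256((prev + body).encode()).hexdigest()
    log.append(entry)
    return entry

def verify(log):
    """Recompute the digest chain; False means the log was altered."""
    prev = ""
    for e in log:
        body = json.dumps({k: v for k, v in e.items() if k != "digest"},
                          sort_keys=True)
        if e["digest"] != hashlib.sha256((prev + body).encode()).hexdigest():
            return False
        prev = e["digest"]
    return True

site_log = []
log_deployment(site_log, "libc", "libc--stable--2.2",
               ["site/emergency-fix-1"])
log_deployment(site_log, "sh", "sh--stable--1.14", [])
assert verify(site_log)
```

   With such a log, a site can always answer "exactly what is running
   here, and how does it differ from upstream?" -- which is the
   precondition for detached operation.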

   RATIONALE

   If you buy a big HVAC system, your (competent) facilities manager
   is going to be sure he's got the ductwork plans, the site-specific
   analysis reports, the schematics, a stock of critical tools and
   parts, and robust supply chains for parts and service.

   Source code should be no different. Every large customer should
   have the five items listed above.

   There are lots of growth opportunities here, too, as the purview and
   capabilities of IT management and system administration expand.

   Why should big customers take on these new costs? Aside from the
   risk management aspects, there's the new and very winning
   opportunity it creates for better site-specific support and better
   site-specific customization.

   Why is site-specific customization a big deal? Because of the now
   infamous "ROI" problem that plagues, for example, IT software.

   People debate the degree to which technology has genuinely
   increased productivity and one hears plenty of painful stories
   about expensive IT reforms that left companies with apps that don't
   work and workers who hate their apps.

   By increasing the customizability of big IT sites and bringing
   dedicated developers closer to those sites (for which it is necessary
   to bring better source management closer to those sites), we enable
   fine-grained, low-increment-of-investment, site-specific app design
   and deployment.

   In other words, the solution to the IT ROI question isn't to look
   for the ultimate killer app -- the solution is to get engineers on
   site, analyzing what's going on with the technology, and solving
   the real problems in their full and specific context. With a
   well-built computing system, a handful of 5K-line custom apps can
   often do a lot more to increase productivity than deploying a
   $gazillion killer app.

   Aside from better source management, that shift in focus requires a
   shift in how generic software is designed. A greater focus on
   preparing for in-the-field customization and extension has a high
   value, in this context.

5) Organize businesses and business units that join moderately sized,
   regionally identified sets of individual consumer and small
   business customers into tractable markets for support, treating the
   support service providers in those markets as "large customer
   installations".

   A "support service" in this context means:

        1) The company down the street where you buy, if not your
           computer, at least your software.

        2) The small team of people who answer the support phone line
           -- people who are competent sysadmins and programmers, whom
           you might come to know by name.

        3) The local service that collects your feedback about
           the system: how well it works for you, what additional
           functionality you'd like to see -- and brings your feedback
           to the larger development community in an effective form.

        4) The local service that keeps you informed about the latest
           developments and newly available tools.

   RATIONALE

   There is little essential difference between a big IT site and,
   say, a small company serving 10,000 GNU/Linux home users in the
   East Bay. This is another growth opportunity in multiple
   dimensions.

   A phone bank of scripted "support" staff is notoriously frustrating
   for users and for support workers alike. The alternative described
   above has worked very well on college campuses for decades, and is
   ripe for conversion to a business model.

6) Simplify GNU/Linux distributions; reduce the core code base size;
   reduce the out of control package dependencies; focus on essential
   functionality, quality, and tractable extensibility.

   The meaning of "simplify" in this context is difficult to sum up
   with only a few bullet points. Its meaning includes:

        1) Designing small tools that each do one task well, but that
           are designed to combine in powerful ways.

        2) Building tools that do _not_ gratuitously rely on a
           "tower of standards" and the many libraries needed to
           implement them.

        3) Designing complete systems that offer all of the
           functionality end-users need, but with as few components
           as possible.

        4) Aiming for components which are small enough to be read,
           fully understood, and maintained by individuals.

        5) Aiming for components, especially user interface tools, that
           are well and conveniently documented, customizable, and
           extensible.

   RATIONALE

   All too often, in a rush to "build out quickly" and to "generate
   momentum behind emerging standards", projects and companies these
   days have taken to building "Spruce Gooses": aesthetically
   enticing systems, hopelessly ambitious in their conception, built
   from the wrong materials, at too great an expense, and not quite
   able to fly.

   Some software architectures are tractable. Others are not.
   There's a lot of bloated, confused, intractable work out there
   these days. The old advice to K.I.S.S. ("Keep it Simple, Stupid")
   is always good to keep in mind.

                           RECOMMENDATIONS

  To turn the sketch into business activity is a sales challenge,
  a planning challenge, and a technical challenge.

  I'll speak to two out of three of those:

  Technically: I would like to find resources sufficient to complete my
  work on software tools essential for the tasks outlined above.
  Currently I have `arch' -- a distributed, modern, lightweight,
  tractable revision control system that is quite far along in its
  development; and `package-framework', an earlier stage project
  focusing on release engineering and tools for testing.

  I would like to spend part of my time managing a small team of
  programmers to finish those projects, and polish the hell out of them.

  Planning: I have some techniques I like to apply to large planning
  efforts involving multiple people. Simple but useful organizational
  tricks and presentation tricks. We need spaces, both real and
  virtual, to collect notes, post "maps" on the wall, have informal
  meetings, build a library, and draw up resources and requirements
  blueprints. We need an "intelligence center" for a task of this
  scope.

  I'm interested in managing such an intelligence center. It'd be
  useful for the project above and, beyond that, revisiting other
  architectural aspects of free software. I believe that, if
  properly resourced, I can build such a center as a comfortable and
  inviting retreat from the office park world -- and that's really the
  kind of relaxed atmosphere that a task this serious and broad needs.

Received on Thu Oct 10 22:00:06 2002

This is an archived mail posted to the Subversion Dev mailing list.
