Re: Aegis (fwd)

From: Brian Behlendorf <brian_at_collab.net>
Date: 2000-05-20 00:15:07 CEST

No need to study this for anything particular to the svn design; I just
wanted to forward it here to see if it triggered any ideas. Mostly, as I
read it, I kept saying to myself, "sounds like we've solved that in a
better way", and I was reminded of various other reasons why not going
with Aegis was a good idea.

Brian

---------- Forwarded message ----------
Date: Thu, 18 May 2000 13:22:48 +0000
From: Peter Miller <millerp@canb.auug.org.au>
To: Jonathan S. Shapiro <shap@eros-os.org>
Cc: dcms-dev@eros-os.org
Subject: Re: Aegis

I think this got trashed in the mailing list snafu (leastwise, I
never saw the list spit it back at me) so I'm sending it again.

"Jonathan S. Shapiro" writes:
> WORKFLOW:
>
> > > Aegis wants to take over your entire
> > > lifecycle model, and I don't want that.
> > [Aegis] does model the development lifecycle of a change set...
>
> I should preface this by saying that I find chapter 2 of the Aegis guide
> incredibly off-putting. This could be purely a problem with me as a reader.
> The best way to summarize my issue is that the word "must" appears in far
> too many places.

This is interesting feedback. This chapter presumes a number of
project configuration settings, and the "must" usually reflects
those (admittedly, unstated) settings. Would a statement at the
start of each section (dev, rev, etc) as to those settings (and
where to set/change them) be sufficient?

> The model itself seems like a good one, but I see no reason that it needs to
> be hard-coded into the tool.

With the benefit of hindsight, neither do I. However, messing with
Aegis' state machine is pretty major surgery this late in the game.
Also, the sheer number of boundary conditions in the validations
for each state transition (e.g. aegis/aede.c) makes me *very* wary
of scripting the whole thing. In particular, how to get a good
security story for a user-defined state: but if you supply pre-canned
rules, you wind up with something rather similar to what is there
now.

Also, see below, towards the end, for adding additional states.

> I don't agree that this is the only valid
> workflow for change sets,

Agreed. Aegis' model is simply very general; and at times slightly
inapplicable *because* of that generality. But inapplicable chunks
of language compilers don't stop us from compiling our code, we just
never use those options. Many of Aegis' features can be turned off or
ignored, too.

> and I have been involved in projects that (a)
> eventually came to use the workflow described and (b) would have failed if
> they were required to use it from the start.

This is a common comment. What I don't understand is this: why, because
of the starting point, do folks reject the destination they say they
wish to reach?

The starting point in this case isn't so different to where many folks
are at present: If the developer doesn't bother to aede for long periods
of time, OR if the developer starts by saying ``aecp .'', then Aegis'
model (from a developer's day-to-day perspective) collapses to something
very similar to a CVS work area.

> The reviewer notion makes sense. This is, in essence, an "endorsement"
> model, which is conceptually similar to a very old idea from Xanadu. It
> works quite well, but it should be optional (as, apparently, it is).

(Aegis is based on very old ideas, tho I haven't heard of this Xanadu.)

Actually, this is the place I expected you to object to. Many
users have asked if I could move the review state around all over
the map. Usually this is because they are in an org which has very
very long lived work areas (something I don't encourage, but Aegis
doesn't prevent) which result in the need for intra-development reviews
as well as post- and post- post- reviews.

> I confess that I can't tell the difference between "developer" and
> "integrator".

Actually, the integrator role serves the integration state which
arises out of the atomicity requirement. Also, when I was first
writing Aegis, I worked in an org which had zillions of problems detected
at integration time - changes were *frequently* rejected at this stage.
So, the integrator role -can- act as a second review, or a kind of
"reviewer reviewer" job.

Those problems actually arose out of that org NOT having some of Aegis'
interlocks. The mandatory requirements for a successful build and a
successful test would have caught most of them. So much so that many
sites use the integrate_q.sh script included in the Aegis distribution,
which simply runs integrations each night from cron: if there are hassles,
the changes are returned to the developer for rework.

Differences? The developer role can change things, the developer
is the ONLY role which can change things. The integrator role can
veto or pass things, they CANNOT change anything. It is possible
(and common) to allow a developer to integrate their own work: the
developer list and the integrator lists overlap, and the
developer_may_integrate field is set to true.
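
A minimal sketch of the rule just described, in Python. Only the
developer_may_integrate name comes from the text above; the function,
its parameters and the user names are hypothetical and are not part of
Aegis.

```python
def may_integrate(user, change_developer, integrators, developer_may_integrate):
    """Decide whether `user` may integrate a change developed by `change_developer`.

    Hypothetical helper: only the developer_may_integrate name is taken
    from the message; everything else is illustrative.
    """
    if user not in integrators:
        return False  # not on the integrator list at all
    if user == change_developer:
        # Integrating your own work needs the explicit project setting.
        return developer_may_integrate
    return True


# Illustrative integrator list.
integrators = {"alice", "bob"}
```

With the setting off, alice may integrate bob's change but not her own;
with it on, she may integrate both.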

> I imagine a world in which two branches may have differing integrity
> requirements. As a developer, I have a work in progress. I may wish to
> "check in" to that branch several times. Sometimes, there are reasons to
> check in a state on a developer branch that doesn't compile (if only for
> archival or transfer purposes). In short, this is a branch with no integrity
> guarantees. At the same time, I envision branches with high integrity (what
> your document appears to call a "baseline"). Such a branch requires all of
> the process described in your workflow, and possibly more.

The integrity requirements of a branch are local to the branch (inherited
from the parent until overridden). They can be messed with on one branch
without affecting another branch - until (if) you end the branch.

If you don't have a ``successful build'' requirement, change the
build_command to read "exit 0" (which isn't very useful) or
"make ... || true" which is more likely to be useful.

If you don't want tests, set default_test_exemption = true;
and that interlock won't be required.
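
The "inherited from the parent until overridden" behaviour can be
pictured with a small Python sketch. The Branch class below is a
hypothetical model, not Aegis code; only the build_command and
default_test_exemption names and the two command strings are taken from
the text above.

```python
class Branch:
    """Hypothetical model of a branch whose settings fall back to its parent."""

    def __init__(self, name, parent=None, **overrides):
        self.name = name
        self.parent = parent
        self.overrides = dict(overrides)

    def setting(self, key):
        # Walk up the ancestry until some branch defines the key.
        branch = self
        while branch is not None:
            if key in branch.overrides:
                return branch.overrides[key]
            branch = branch.parent
        raise KeyError(key)


trunk = Branch("trunk", build_command="make", default_test_exemption=False)
# A low-integrity branch overrides both interlocks locally.
scratch = Branch("scratch", parent=trunk,
                 build_command="make ... || true",
                 default_test_exemption=True)
# A branch with no overrides simply sees the trunk's values.
release = Branch("release", parent=trunk)
```

Changing scratch leaves trunk and release untouched, which is the
property the paragraph above relies on.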

> ACCESS CONTROL:
>
> > > I'm partial to cryptographic solutions here, as they provide the most
> > > reliable means of identity that I know about
> >
> > Aegis has this - configure it to use PGP or GPG or whatever.
>
> The Aegis user's guide emphasizes strongly that Aegis uses the UNIX access
> control model, and I am unable to reconcile this with the "use PGP/GPG"
> idea. Perhaps the documentation is out of date, or perhaps I am missing
> something. I'd appreciate it if you might expand on this.

There are several levels here, and I assumed you were speaking of the
Internet case. I'll expand...

One of the design elements of Aegis is very unix-like: if functionality
exists elsewhere, don't re-invent it. You can see this with the
build tool, the history tool, etc. This has been a huge advantage,
because most developers do things utterly differently to me, and this
flexibility means they can (and do!) do it their own way.

Security and user authentication is one such non-reinvention.
Why invent yet another buggy inadequate limited security system when
there are already so many to choose from?

Thus, Aegis uses the security services offered by the operating system.
It is showing its history that it uses uid/gid so much - however it would
be possible to identify users in a different way without major semantic
repercussions throughout the code (it's reasonably isolated into its own
data structure). The ability to impersonate users, however, is essential.

Back to those levels I mentioned...

1. The Isolated Machine

If you can't trust the security offered by the local operating system,
you may as well give up. Even with cryptographic file systems, you have
to trust the local operating system for the whole time it's mounted.
Even with Plan9-like per-process mounts, if you can't trust the local
operating system, then your processes can be snooped, and the per-process
mounts don't help. Thus, Aegis trusts the local operating system's
security.

The commonest function is for Aegis to ask who the current user is
(and later, to impersonate them). This happens *after* the operating
system has authenticated them. Granted, username/passwd is not a
strong authentication scheme - if your OS has something better, great.
Aegis isn't involved at user authentication time, and doesn't want to be.

(Memories of hideous Oracle user admin discontinuities still haunt me.
I'd rather not re-invent that particular disaster.)

2. The Isolated LAN

Aegis in LAN mode makes some assumptions, again, about the operating
system. It assumes that user identities are the same between cooperating
machines on a LAN (it remembers the user name, not the uid). It assumes
files (the ones it cares about, anyway) have the same name and are in fact
the same file, on all cooperating machines. That is, there is no
centralized database server (like, say, ClearCase) but that means Aegis
assumes file locks work between machines.

Granted, NFSv2 security leaves a great deal to be desired, and then some.
This is another thing Aegis doesn't reinvent: there are already plenty
of buggy inadequate limited network file systems to choose from. If you
are using a more secure more capable networked file system than NFS,
great; Aegis doesn't need to be recompiled.

Watch out for those interoperability issues, tho.
Aegis supports the concept of heterogeneous development.

3. The Big Bad Internet, telecommuting, etc

Communicating between repositories is accomplished by aedist. It
provides simple building blocks, to which may be added all sorts of
other things: email, web, ftp, PGP, etc.

This is what I was replying to, originally. Personally, I would
recommend folks use PGP or something like it, especially if they
have change sets traversing the Internet or POTS. (Again, Aegis
specifically does not implement its own protocols, or its own
authentication, or its own encryption. There are plenty of bad...
you get the idea.)

> DISTRIBUTION:
>
> The aedist mechanism appears to be designed to ship change sets from one
> group to another, but it appears to assume that the parties are working at
> arm's length. What I'm looking for is a way to mutually cross-synch the
> repositories. This should not involve an additional build/review cycle,
> because the sending repository has already done this.
>
> Am I missing something here?

Yep: Change sets are not commutative.

The "mutually cross-synch the repositories" expectation usually comes with
an assumption that it is an automatic, hands-free operation. Unless one
of the two repositories is completely passive, this just isn't possible.
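
A toy illustration of why order matters, as a Python sketch. The
repository is modelled as a dict of file contents and each change set
as a dict of edit functions; both change sets are invented for the
example and have nothing to do with how Aegis stores changes.

```python
def apply(repo, change_set):
    """Return a new repository state with the change set applied."""
    new_repo = dict(repo)
    for path, edit in change_set.items():
        new_repo[path] = edit(new_repo.get(path, ""))
    return new_repo


# Two hypothetical change sets touching the same file.
rename_call = {"main.c": lambda text: text.replace("init()", "setup()")}
add_call = {"main.c": lambda text: text + "init();\n"}

base = {"main.c": "init();\n"}

# Same two change sets, opposite orders.
a_then_b = apply(apply(base, rename_call), add_call)
b_then_a = apply(apply(base, add_call), rename_call)
```

Here a_then_b still contains a call to init(), while b_then_a does not,
so two repositories that received the same change sets in different
orders end up with different contents.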

(This is what intrigues me about CODA, by the way. They have manual
resyncs for this very reason.)

Rather than implement something which was going to generate huge numbers
of "Aegis did a shitty automated repository merge" bug reports, I left
it out completely, because there is an alternative...

1. incoming email is authenticated and decrypted by PGP (auto)
    1a. if fails, toss it
2. change set is unpacked, built and tested (auto)
    2a. if fails, usually due to -merge- issues, let human fix it
3. this space intentionally left blank :-)
4. change set is now "awaiting review" (manual)
    4a. if fails, let human decide what to do (email, phone, fix it
        themselves...)
5. cron job automatically integrates change sets which have passed review.
    5a. if fails, let human decide what to do (email, phone, fix it
        themselves...)

but there is another possibility in the decision tree ...

3. if it is from our sister repository (as established by PGP
authentication), do an automatic review pass. Skip to step 5.
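
The decision tree above, including the sister-repository shortcut, can
be sketched in Python as follows. This is only an illustration of the
flow: the class, the state names and the example sender address are
hypothetical, and the PGP check and the build are reduced to booleans.

```python
from dataclasses import dataclass

# Hypothetical sender identity for the "sister repository" case.
SISTER_REPOSITORIES = {"sister@example.org"}


@dataclass
class IncomingChangeSet:
    sender: str
    signature_valid: bool   # outcome of the PGP check (step 1)
    builds_and_tests: bool  # outcome of unpack, build and test (step 2)


def triage(change):
    """Return the state an incoming change set ends up in."""
    if not change.signature_valid:
        return "discarded"             # step 1a: toss it
    if not change.builds_and_tests:
        return "needs human fix"       # step 2a: usually merge issues
    if change.sender in SISTER_REPOSITORIES:
        return "awaiting integration"  # alternative step 3: automatic review pass
    return "awaiting review"           # step 4: manual review
```

Anything that reaches "awaiting integration" is then picked up by the
cron job of step 5.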

All of this relies, of course, on actively using Aegis' regression
test facilities, to establish that an incoming change set doesn't
break anything *locally* (even tho it doesn't break anything where
it came from).

> This should not involve an additional build/review cycle,
> because the sending repository has already done this.

What if something changed that invalidated one of the .o files?
You can't just import the .o files, that may drop *local* functionality.
(Imagine two intersecting-but-not-conflicting change sets passing each
other headed in opposite directions.) What if they are BSD-Sparc and you
are Linux-ARM? You can't just import the .o files, they aren't
useful. {And then there's the bandwidth.}

> > > Finally, we need a
> > > system that can let two companies set up a shared workspace with some
> > > reasonable story for security.
> >
> > Aegis has this. (Including secure private repository replication and/or
> > updates over the Internet.)
>
> If so, I failed to see it.

Aedist. See immediate previous point.

> In particular, what happens when I create a branch in my repository that
> happens to have the same name as a branch in your repository? How is the
> collision resolved during synchronization?

(a) It isn't a collision. Both repositories assume you mean the same branch.
(b) The branch won't be created automagically in the other repository.
    Because branches are "big changes", it will only automatically
    propagate when you aede...aeipass the branch itself - if the parent
    is configured to do so.

> More generally, I see no description of how naming is handled in the
> repository.

Naming of what, exactly? You seem to use the term to include more things
than the names of files.

> Given the possible use of RCS, it's not clear to me that Aegis
> controls the names at all.

You can control filenames using a variety of fields in the project
config file: filename_pattern_accept, filename_pattern_reject,
maximum_filename_length, etc, etc. In addition, the review cycle
is an additional filename control (like it "controls" the rest of
the content, too).

> In the absence of universally unique names I
> don't see how to get a collision-free synchronization mechanism...

If the names in the two repositories never intersect, you don't need to
sync them. By definition, they can't get *out* of sync, except in the
most trivial of ways. Use an FTP mirror program.

Given that they *must* intersect for there to be any point, you are right,
there is absolutely no way to implement a collision-free synchronization
mechanism. Change sets Are Not commutative, as I mentioned above, tho
you can get lucky for long periods of time. Anyone who says they have
such a mechanism is peddling snake oil.

> > > We need
> > > a means to have branches that are tightly controlled by the core
> > > developers,
> >
> > Aegis has this.
> >
> > > and simultaneously a means for any Tom, Jane, or Harry to make
> > > changes in a local version for later submission into the pot.
> >
> > Aegis has this.
>
> Can you expand on this? How would Tom, Jane, or Harry go about this?

1&2. The local machine and/or the LAN

The list of developers for a branch is local to that branch. They are
inherited from the parent branch on creation, and may be specialized or
diversified after that time.

Thus, you could have a project trunk which permits you and me,
only, as developers. On creation of a branch, we could expand the
list of authorised developers of the new branch to include tom,
jane and harry, without at all affecting the access list for the
trunk.

3. The Big Bad Internet

They are two utterly separate repositories, with utterly separate access
lists and policies. However: when an incoming email hits your intray,
you can make decisions. See above. One decision can be: if you aren't
on my list, go away.

> Also, how is the "config" file, which appears to be per-project,
> turned into something that is per-branch?

Think of branches as a version numbering system, rather than a
physical parking place. One file, multiple versions.

The config file is, in many ways, "just" a source file. If it is present
in a branch, it is used as the authority for that branch. If it is not
present in a branch, it is "inherited" from the parent branch.
Thus, a branch may override the config file of its parent.

Also, just to be confusingly complete, projects and branches are almost
interchangeable terms in Aegis, but not quite. In general, if a project
can be configured in some way (staff, policy, etc) so can a branch.

> REMOTING
>
> The Aegis code does not appear to make a client/server split between the
> repository management code and the user agent. The recommended usage for
> Windows is via Samba.
>
> Have I failed to understand something?
> If not, for what reason did you avoid a client/server design?

There are several reasons.

1. The non-reinvention stuff. There are already plenty of replication
   protocols and systems out there. Granted, there are times when a
   networked file system of some kind is not available (disconnected
   operation springs to mind) but a decade after I decided this, there
   are things like CODA happily filling the gap, confirming my original
   decision (in my mind, anyway).

2. Aegis borrows ideas from things like Odin, Sun's TFS, Teamnet,
   ClearCase, in that in any mature project, you are usually working on
   a minuscule fraction of the files in the repository. Why not USE the
   repository, rather than have redundant copies all over the place,
   slowly going stale? Combine this with the fact that, in order to
   confirm the atomicity requirement, integration does a build, why
   not keep the build results? This, in turn, leads to object files
   (as well as source files) which may be shared (i.e. not recompiled).

   Read: VPATH (except the Posix semantics are severely brain dead).

   The client/server model of CVS is entirely inadequate in this instance,
   as it is predicated on a work area model which Aegis simply does not use.
   This in turn leads to producing something horribly like a networked
   file system, not a simple remote checkout/checkin facility; see item 1.

   Of course, if you ``aecp .'' then your work area looks a lot like
   a CVS work area, and you can, at a stretch, work disconnected.
   The gotcha is that the Aegis commands (aeb, aed, etc) won't be able
   to find the Aegis database. So, you need to do manually what Aegis
   would look up and figure out for you in commands like aeb, aet.

3. I only suggested Samba because PC-NFS tends to make serious users
   puke. You could use PC-NFS if you really want to, or any other networked
   file system accessible by both operating systems.

   The next problem is the (usual) discontinuity between Unix and
   Windows user databases, and thus the inability to rely on Win95's
   "authentication" of a user. I've met WinNT systems which used the
   Unix (well, NIS) user database, and thus the authentications by one
   could be sensibly trusted by the other, but this isn't common.

   Coupled with this is that I haven't any expertise in the intimate
   gizzards of the Windows NT security model. I know it *can* impersonate
   users, but I need someone with the right expertise to implement a
   way for Aegis to do so without opening a truck-sized security hole.

4. Use aedist for remoting.

> SCRIPTING
>
> > > Programmability -- it should be possible to embed trigger scripts in a
> > > project. These scripts should run successfully on all platforms.
> >
> > Aegis has this.
>
> I can't find this in the user's guide. Can you provide a pointer to a
> section?

See 4 paras down for the specific "embed trigger scripts" stuff.

The stuff to do with security (availability, integrity, confidentiality)
can't be broken down into smaller pieces. That is, you can't directly
mess with (add, subtract) the state transitions. [But see below.]

However, all of the commands were designed as command-line driven from
the start. (This, even though, in 1991, X11 was obviously here to stay.)
If you want to skip the review state, for example, simply build yourself
a script which does ``aegis -dev-end && aegis -rev-pass'' (you'll need
to set developer_may_review, of course). This is the coarsest and most
obvious scripting.
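
The same "stop at the first failure" chaining can be written as a
small Python wrapper, sketched here. The two command lines are the ones
quoted above; any further arguments a real site would need (project,
change number and so on) are deliberately left out, and the function
name and the injectable `run` parameter are assumptions made for the
example.

```python
import subprocess

# The two steps quoted in the text; real use may need extra arguments.
STEPS = [["aegis", "-dev-end"], ["aegis", "-rev-pass"]]


def end_and_pass(run=subprocess.call):
    """Run each step in order; stop at the first non-zero exit status."""
    for argv in STEPS:
        status = run(argv)
        if status != 0:
            return status  # mirrors the short-circuit of `&&` in the shell
    return 0
```

Passing a stand-in for `run` makes it possible to try the wrapper out
without an Aegis installation.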

With the advent of Tk/Tcl, however, it is possible to write GUIs on top
of the command line functionality. See tkaeca for an example. More are
needed, of course.

All of the state transitions have the ability to attach a "notification"
command. See aepa(1) and aepconf(5) for more detail. This permits
interlocks with, for example, a bug tracking system. (Yet another
thing I didn't build into Aegis, on purpose).

There are many places that you can add state transition predicates
(predicates, as opposed to post-ordered triggers). The commonest
predicates, of course, being imposed on the "develop end" transition.
Because of the "successful build" requirement, many validations
can be imposed in the build. E.g. I require Unix end-of-line rather
than DOS end-of-line, and enforce this with a build rule (one that
can't be skipped if aede is to happen). Other validations can be
imposed by hi-jacking the difference or merge or history_put
commands.
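
As an illustration of the end-of-line validation mentioned above, here
is a minimal Python sketch of a checker that a build rule could invoke.
It is not the rule actually used; the function names are hypothetical,
and how the check is wired into the build tool is left to the site.

```python
def has_dos_line_endings(data: bytes) -> bool:
    """True if the data contains at least one CR LF pair."""
    return b"\r\n" in data


def check_files(paths):
    """Return the subset of paths whose contents use DOS end-of-line."""
    offenders = []
    for path in paths:
        # Read in binary mode so newline translation cannot hide a CR.
        with open(path, "rb") as handle:
            if has_dos_line_endings(handle.read()):
                offenders.append(path)
    return offenders


def main(argv):
    """Exit status for a build rule: 0 if all files are clean, 1 otherwise."""
    offenders = check_files(argv)
    for path in offenders:
        print(f"{path}: DOS end-of-line found")
    return 1 if offenders else 0
```

A build rule would call main() with the source files of the change and
fail the build on a non-zero result, so that aede cannot proceed until
the files are fixed.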

There is also a scripted report generator. It is using this report
generator that the aegis.cgi interface is implemented. Huge numbers of
things folks want to do are read-only things which may be implemented
in a similar way, with the operations which need to change Aegis' state
being directed back through existing Aegis command line operations;
this is how aedist works, for example.

Messing with state transitions: You *can* add extra states, by exploiting
the way Aegis does branching. Say you want additional testing performed
by the Formal Test Team. By creating a branch, and having a developer
work on that branch, for a few changes, PLUS the formal test team create
changes which contain only tests, that branch (representing a "big"
change set) may itself be reviewed, must pass its tests, etc, in addition
to reviews performed less formally on the change sets within that branch.
Thus, additional review points and test points have been introduced,
without hacking Aegis' state machine. Build and script above, rather than
within. Because branches are so simple, so cheap and so quick to create,
you can create very short lived branches to encapsulate just a few change
sets into one big change set, with different developers with different
duties or expertise working within the change set (the big one, that is)
until you aede the branch, just like any other change set (really!).

> > > The CM system should have per-branch access control. That is, it should
> > > be possible to have an "official" branch that is world readable but
> > > modifiable only by select people. Simultaneously, Jack Random should
> > > be able to take a given readable branch and create a new working branch
> > > accessible only to Jack Random or his designates.
> >
> > Aegis has this.
>
> Can you provide a pointer into the user's guide or the reference manual?

The aend(1), aerd(1), aenrv(1), aerrv(1), aeni(1), aeri(1), aena(1),
aera(1), aenbr(1), aenbru(1) commands are the ones you want. Read
"project" and "branch" interchangeably.

Regards
Peter Miller E-Mail: millerp@canb.auug.org.au
/\/\* WWW: http://www.canb.auug.org.au/~millerp/
Disclaimer: The opinions expressed here are personal and do not necessarily
reflect the opinion of my employer or the opinions of my colleagues.
Received on Sat Oct 21 14:36:04 2006
