Some "easy" kinds of distributed Subversion operation

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2002-02-16 07:45:37 CET

This mail is forward-looking, but hopefully still useful. My
apologies if a lot of people already came up with these ideas.

"Distributed operation" as applied to version control can refer to a
lot of different use cases and functionality. It can refer to:

  1. An organization maintaining a private version of software
     developed elsewhere (which CVS handles via "import", though not
     terribly well),

  2. A developer working offline on a disconnected laptop and being
     able to locally check in changes which he later commits to a
     central repository, or

  3. Having a directory in a repository which really comes from
     somewhere else, e.g. "apr" in the Subversion repository could
     come from a different Subversion repository containing apr.

and probably some other scenarios. Each scenario has a lot of
variations depending on what kind of behavior is required by the
users, but here is how I think we can handle the above three use cases
at a basic level without changing our architecture very much:

1 and 2: Local branches
-----------------------

Let's say we have a previously existing central repository at
http://a. (Someone clearly had an "in" at ICANN.) We will set up our
local repository (within the organization, or on the laptop) at
http://b. Now let's look at what we can do:

  * With just straightforward "svn export" and "svn import", we can
    make copies of directories between the repositories, and then use
    "svn merge" to do the required merge work. This is slow and
    space-inefficient, but it would work acceptably, at least for
    smallish projects. We could support "svn cp" between repositories
    ("svn cp http://a/proj http://b/proj") as an abbreviation for this
    kind of export-import combo.

  * If we could "svn merge" with sources which aren't in the same
    repository as the working directory, or in the same repository as
    each other, then we could speed up the above operations. For
    instance:

      svn cp http://a/proj http://b/proj # The only slow step.
      svn cp http://b/proj http://b/proj-base
      svn co http://a/proj /proj # (Syntax questionable.)
      svn co http://b/proj /local/proj
      svn co http://b/proj-base /local/proj-base
      cd /local/proj && edit edit edit && svn ci

      # Sync local branch against central branch.
      svn cp http://b/proj-base http://b/proj-sync
      svn co http://b/proj-sync /local/proj-sync
      cd /local/proj-sync && svn merge http://a/proj && svn ci
      cd /local/proj && svn merge http://b/proj-base http://b/proj-sync \
        && resolve conflicts && svn ci
      # Replace the old base with the freshly synced one.
      svn rm http://b/proj-base
      svn mv http://b/proj-sync http://b/proj-base

      # OR, Sync central branch against local branch.
      cd /proj && svn merge http://b/proj-base http://b/proj \
        && resolve conflicts && svn ci

  * If we had merge history which was flexible enough to deal with
    remembering merge sources from different repositories, then we
    could eliminate a lot of the bookkeeping in the above example.
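
To make the first bullet concrete, here is a rough sketch of the
export-import combo that a cross-repository "svn cp" would abbreviate.
The helper name cross_repo_copy is made up for illustration, and the
argument order for "svn import" is the one current clients use, which
is an assumption on my part. It carries over no history, only the
tree as it stands at the source's HEAD.

```shell
# Sketch only: emulate a cross-repository copy with export + import.
# cross_repo_copy is a hypothetical helper, not an svn subcommand.
cross_repo_copy() {
    src_url=$1
    dst_url=$2
    tmp=$(mktemp -d) || return 1

    # Export an unversioned tree from the source repository, then
    # import it as a new tree in the destination repository.
    if svn export "$src_url" "$tmp/tree" &&
       svn import "$tmp/tree" "$dst_url" -m "Copy of $src_url"
    then
        status=0
    else
        status=1
    fi

    # Clean up the scratch directory whether or not the copy worked.
    rm -rf "$tmp"
    return "$status"
}

# Example: cross_repo_copy http://a/proj http://b/proj
```

This is exactly the slow, space-inefficient path described above; the
point is only that nothing new is needed on the server side for it.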

What we should be looking towards, then, is:

  * The ability to "svn cp" across repositories, for convenience.

  * The ability to "svn merge" without assuming that the merge sources
    are within the same repository.

  * Storing merge history information in the target (where we know we
    have write access) and allowing source information to be from a
    different repository.
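
As a rough illustration of keeping merge history in the target, one
could imagine recording each merge source in a property on the target
working copy. Everything specific here is an assumption: the property
name "local:merged-from", the URL@REV entry format, and the helper
record_merge_source are all invented for the sketch. Only "svn
propget" and "svn propset" are real commands.

```shell
# Sketch only: remember where a merge came from by writing a property
# on the target working copy. "local:merged-from" is a made-up
# property name, and record_merge_source is a hypothetical helper.
record_merge_source() {
    target_dir=$1
    source_url=$2
    source_rev=$3

    # Read any earlier entries; an unset property yields an empty string.
    old=$(svn propget local:merged-from "$target_dir" 2>/dev/null)
    new="$source_url@$source_rev"

    # Append the new entry on its own line after the existing ones.
    if [ -n "$old" ]; then
        new=$(printf '%s\n%s' "$old" "$new")
    fi
    svn propset local:merged-from "$new" "$target_dir"
}

# Example: record_merge_source /local/proj http://a/proj 42
```

Since the property lives in the target, it gets committed along with
the merge result, and the source repository never needs to be written
to.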

What we don't get from this approach is the ability to propagate
commits upstream or downstream without losing track of the changeset
divisions and log messages, as BitKeeper allows. So we'd be clearly
inferior for use case (2) without some kind of specialized operation
along those lines. ("svn up" or "svn down", anyone? Actually, if
"svn up" is currently an alias for "svn update", we might consider
killing that alias in order to reserve the command name.)

3: Directories from other places
--------------------------------

Because Subversion has a single repository version, I don't know if it
really makes sense for a repository to be going out and fetching
information from other sources on checkout (an idea Branko called
"computed nodes," I think). A better and simpler approach might be a
way to store mixed working directory setup information in a
repository. For instance, a directory (presumably empty) could have a
property "svn:redirect" which gives the real URL to check out that
directory from. The same could even apply to files, although it's not
clear when that would be useful.
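
To give a feel for the client side of this, here is a minimal sketch
of a wrapper that honors such a property. "svn:redirect" is only the
name proposed in this mail; no client defines it, and a real client
may refuse property names in the "svn:" namespace that it does not
know. The helper name checkout_redirect and the choice to check out
into a separate destination directory are likewise assumptions.

```shell
# Sketch only: check out a directory from wherever its (proposed)
# "svn:redirect" property points. checkout_redirect is hypothetical.
checkout_redirect() {
    placeholder_dir=$1   # versioned directory carrying the property
    dest_dir=$2          # where to put the redirected checkout

    # An unset property yields an empty string.
    url=$(svn propget svn:redirect "$placeholder_dir" 2>/dev/null)
    if [ -z "$url" ]; then
        echo "no redirect set on $placeholder_dir" >&2
        return 1
    fi

    # Fetch the real contents from the URL named by the property.
    svn checkout "$url" "$dest_dir"
}

# Example: checkout_redirect /local/proj/apr /local/apr
```

A real implementation would presumably do this inside "svn co" and
"svn up" themselves, so that the mixed working directory is set up
without the user having to ask.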

Redirect properties could even point a client at a CVS repository,
assuming the client is configured to allow it and has a convenient
stash of CVS tools to refer to.

Received on Sat Oct 21 14:37:08 2006
