
Re: Subversion, decentralized version control, and the future.

From: John Szakmeister <john_at_szakmeister.net>
Date: 2007-07-06 20:37:57 CEST

Sorry if this is a repeat, but my first post didn't seem to make it through.

Karl Fogel wrote:
> I've been wanting to post this for a while, but was waiting for the
> dust from Linus Torvalds's GIT talk to settle first (for those who
> haven't seen it: http://www.youtube.com/watch?v=4XpnKHJAok8). Eric
> Raymond's recent post thanking the Subversion team gives me the excuse
> I needed to finally sit down and write this :-). (Eric's post is at
> http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=128106.)
>
> In his talk, Torvalds explained why he thinks decentralized version
> control systems (like GIT and Mercurial) are the way of the future,
> and why he thinks Subversion got it all wrong. I think that's a
> misanalysis, and will describe why below. Unfortunately, Torvalds
> also indulged in a childish presentation style that distracted from
> his useful technical criticisms of Subversion. Since I'd like to use
> some of his arguments as a jumping-off point for thoughts on
> Subversion's future, here they are in brief:
>
> * Optimizing merging is as important as optimizing branching (if
> not more so).
>
> * Speed matters: when a common operation goes from thirty seconds
> to half a second, that changes the whole way you work.
>
> * Having all history locally (or at least as much history as you
> need for a given operation) is useful.
>
> * Reducing the thickness of the "commit access" wall is good for
> development. Torvalds didn't make this argument terribly well,
> so I'll try to restate what I think was his point:
>
> The important question is, who can put changes into the
> repository that the project is publishing releases from? This
> should not be confused with commit access in the technical sense.
> Instead, think of it this way: committing is just a way to
> connect changes to other changes, and you shouldn't need my trust
> in order to connect your changes to anything you want to connect
> them to. The real question is, when and how do I include your
> changes in my release? So the issue isn't commit access, it's
> having trust networks and convenient methods of change selection.
>
> When I talked to Brian Fitzpatrick about this, he listed three things
> as top priorities:
>
> - Faster. Subversion does need to be faster for many ops.
> - Offline commits.
> - Local branches.

I'm not sure what "local branches" are... but if it's an opportunity to
try a few things (versioned) locally, and then submit the result to the
repository, then a huge +1 to that.

> I would add "better merging", but basically agree with Fitz (note that
> we're getting much-improved merging in Subversion 1.5).

A big +1 on better merging. We're definitely getting better merge
support in 1.5 (and thanks to all involved for that!). However, as
dberlin pointed out in a separate thread, there are some cases that
would be tough to solve without a changeset DAG.
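
To illustrate why a changeset DAG helps with those cases, here is a minimal
sketch (not Subversion's merge-tracking design). If every changeset records
its parents, a merge tool can compute the nearest common ancestors of two
lines of development and use them as the base of a three-way merge. The
PARENTS map, the revision names, and both helper functions are hypothetical.

```python
from collections import deque


def ancestors(parents, rev):
    """Return rev plus every changeset reachable by following parent links."""
    seen = {rev}
    queue = deque([rev])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != other and c in ancestors(parents, other) for other in common)
    }


# Hypothetical history: r3 and r4 both derive from r2, r5 merges r3 and r4,
# and r6 continues from r4.
PARENTS = {
    "r1": [],
    "r2": ["r1"],
    "r3": ["r2"],
    "r4": ["r2"],
    "r5": ["r3", "r4"],
    "r6": ["r4"],
}

print(merge_bases(PARENTS, "r5", "r6"))
```

Because r5 already contains r4, the base for merging r5 with r6 is r4 rather
than the older r2, so the changes in r4 are not applied a second time.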

And while we're at it, the number one complaint that I hear over and
over again: not being able to query the complete history of a file
efficiently. I get emails and calls from customers and other
individuals all the time about this (I don't do Subversion support, but
my customers know that I'm involved in the community). They desperately
want that ability. Hidden under this, they also want branches and tags
to be first class citizens, and have revision aliases.

[snip]
> A general tool configured to behave in a specific way is never quite
> as natural to use as a tool designed for that specific use in the
> first place. In other words, Subversion can -- will have to -- take
> on some of the features of decentralized VC systems, but it will never
> be as good a decentralized system as they are. By the same token, a
> decentralized system can be configured to work like a centralized one,
> but will never be as good at it as Subversion is. The trick for us is
> to keep the centralization feature without some of the limitations
> that have traditionally come with centralization.

Agreed. I'd definitely like to see some features of a decentralized VC
incorporated, but not if it means overly complicating the user interface.

> Concretely, what does this mean?
>
> One of Subversion's flaws (mea culpa) is that we didn't realize the
> usefulness of having symmetrical functionality on the client and
> server sides. The working copy should really be a repository, even if
> it's not always going to store all the history available on the server
> side (with some projects, you really can't, it's too big).

...and we definitely don't want to download an entire repository that
hosts multiple (potentially large) projects.

> So we're going to need a working copy rewrite. We knew that; in fact
> we've talked about rewriting the repository to use something like
> Mercurial's revlog format, for various reasons, and about using that
> kind of repository for working copies as well.

One big issue that I see here is that the repository was designed not to
forget. So we'll need to really think about how to manage the working
copy area, without it growing too large... and more importantly, without
user intervention.

I've played with SVK, and one of the big disappointments was that when I
was done with a project, I still had all of that project's history on my
computer. I had no way to remove it without affecting other mirrored
projects. :-(

> We also have to be faster. Fortunately, we've pretty much agreed,
> IIRC, that we're willing to punt on subdirectory detachability in
> working copies in order to get performance improvements.

+100! I think this has been standing in the way of other things, like
rename tracking, as well.

> And now I'm going to hand-wave on a lot of details. I don't mean to
> start the Subversion 2.0 design thread now, just to offer some
> thoughts on general goals. We don't need to let labels guide our
> thinking ("We are a centralized system!" / "We are a decentralized
> system!"). We do need to recognize that users are not interested in
> becoming version control experts, and we need to pay close attention
> to what they actually want, as opposed to what experts might want them
> to want.
>
> Case in point: what's the most popular feature added to Subversion
> after the 1.0 release? Probably file locking (the ultimate
> centralized feature, by the way). Yes, the heavy-duty developers wish
> for better merging, and I don't blame them. But from watching users@
> and irc.freenode.net/#svn, talking to companies that do Subversion
> support, and from doing some Subversion consulting myself, I think
> locking was actually a more important feature. (Of course, we have it
> already, so that doesn't change anything about Subversion's future,
> I'm just making a point about what's important to users.)

Agreed. Linus's standpoint was not to put anything over 10MB into the
repository (IIRC), and he completely neglected to address merge
resolution of binary files in a decentralized environment.
Unfortunately, in the real world, we have to deal with these things.
For instance, we do a lot of hardware work at my company. We need
firmware images from 3rd parties, schematic captures of the design,
binary test inputs, disk images, drawings, documentation, and
specialized tools in the repository. We've got several repositories
tens of gigabytes in size, because of all the binary data surrounding
that work. Subversion's locking helped people avoid wasting time, and
that's extremely important in our fast-paced environment.

> Subversion's phenomenal adoption rate (*) isn't due to being the only
> game in town. We never were, if you count the proprietary systems,
> and we're even less so now that the open source version control world
> has become so fertile. The reason Subversion is taking over the world
> is because it is tremendously user-focused, and because it provides
> well-documented APIs that enable other developers to write software on
> top of Subversion. We should copy what we need from the decentralized
> systems, but remember that most users don't know or care whether a
> system is centralized or decentralized -- their ideal system is one
> they don't notice. Let's keep our eye on the ball, so they don't have
> to.

Agreed. A number of customers that I work with chose Subversion
primarily because it was easy to get up and running compared to other
alternatives. They recognized that they couldn't gain traction in their
organizations unless the tool was easy... and they were right.

-John

Received on Fri Jul 6 20:37:56 2007

This is an archived mail posted to the Subversion Dev mailing list.