Re: Why rewrite the Subversion working copy?
From: Marc Strapetz <marc.strapetz_at_syntevo.com>
Date: Wed, 23 Apr 2008 12:21:33 +0200
> My opinion? The Subversion project shouldn't spend any more time
To me that sounds too pessimistic -- does the Subversion project itself
Subversion should focus on being the best tool for all "centralized"
So back to the working copy, from my personal experience improving the
--
Best regards,
Marc Strapetz
_____________
SyntEvo GmbH
www.syntevo.com

David Glasser wrote:
> Why rewrite the Subversion working copy?
>
> It's generally agreed by now that libsvn_wc is the most painful part
> of Subversion to deal with. Reasons for this, and the start of a
> design for a rewrite, are enumerated by Erik Huelsmann and others in
> notes/wc-ng-design.
>
> But this is going to be a lot of work! Is it really worth it?
>
> It's been over four years since Subversion 1.0, and over seven years
> since the first milestone release. Subversion showed the world that
> there could be open source version control software that was better
> than CVS; since then, there has been an explosion of new open source
> version control systems.
>
> So when it comes time to consider sinking a large amount of time and
> effort into rewriting the hairiest part of Subversion, a natural
> question is: why bother? Why put the effort into improving Subversion
> instead of working on git, Mercurial, or <insert your favorite new VCS
> here>?
>
> Now, Subversion has many obvious advantages over, say, git:
>
> * more user-friendly UI (except all the weirdnesses that come from the
>   .svn-in-each-directory decision)
> * commitment to portability (specifically, including Windows)
> * library design with a stable and documented API for integration into
>   other systems
>
> But, you know, fixing these problems in git might not be much harder
> than rewriting the Subversion working copy. And by now, I'm pretty
> confident that, *in situations where it's feasible*, the "having all
> of the historical data on the client" model is superior to the
> Subversion client/server model. We spend so much time in Subversion
> development thinking about "roundtrips" and "putting load on the
> server" and so on, which isn't even a consideration for a system like
> git.
>
> But the key phrase there is "in situations where it's feasible".
>
> I'm pretty confident that, for a new open source project of non-huge
> size, I would not choose Subversion to host it, at least not for
> reasons directly associated with the version control system itself
> (eg, I might choose Subversion because I like Google Code Project
> Hosting; github looks like it might be good competition eventually,
> though).
>
> So does that mean Subversion is dead? That we should all jump ship
> and just write a new front-end for git and make sure it runs on
> windows?
>
> Nah. Centralized version control is still good for some things:
>
> * Working on huge projects where putting all of the *current* source
>   code on everyone's machine is infeasible, let alone complete
>   history (but where atomic commits across arbitrary pieces of the
>   project are required).
> * Read authorization! A client/server model is pretty key if you
>   just plain aren't allowed to give everyone all the data. (Sure,
>   there are theoretical ways to do read authorization in distributed
>   systems, but they aren't that easy.)
>
> My opinion? The Subversion project shouldn't spend any more time
> trying to make Subversion a better version control tool for non-huge
> open source projects. Subversion is already decent for that task, and
> other tools have greater potential than it. We need to focus on
> making Subversion the best tool for organizations whose users need to
> interact with repositories in complex ways, like:
>
> * Working on enormous repositories, where you don't want to check out
>   the entire project
>
>   - checkouts below the branch root: we have that!
>   - sparse directories: we mostly have that!
>
> * Working on repositories with enormous files, where you don't want an
>   extra "base" copy of every file if you're only editing a few at a
>   time
>
>   - baseless wcs: we don't have that yet, but it should be easy in
>     wc-ng
>
> * Workspaces where different parts come from different branches
>
>   - switching: we have that, but it's really easy to pass the wrong
>     URL to "svn switch" and break your working copy
>
> * Workspaces containing trees from different repositories, or from
>   different parts of the same repository nested inside each other
>
>   - externals: we have it, but it's a tacked-on wart of a feature with
>     shoddy semantics, a shoddy UI, and big limitations
>
> * Workspaces containing multiple parts of one repository, side by
>   side, which should be committed together atomically
>
>   - this sort of works, sometimes, if you happen to hit one of the
>     codepaths that doesn't try to find a common parent of the commit
>     targets, or you do awful hacks like putting a fake ".svn"
>     directory in the parent directory
>
> A great deal of the power of tools like git comes from their ability
> to assume that situations like the above aren't worth dealing with.
> (And for lots of projects, they're absolutely correct! We don't need
> version control systems to be one-size-fits-all.)
>
> The general Subversion architecture *should* be able to deal with all
> of the above use cases. Some of them are already achieved, to some
> degree or another. None of them should require any server-side
> changes at all. They all could theoretically be achieved without a
> full wc rewrite (see Blair's message today about "file externals"),
> but dealing with the current wc is full of pain.
>
> I'd like to put time and energy into fixing the working copy
> situation. But I don't want to fix it just in order to make easy
> things still easy. I want to fix it to make hard things feasible.
>
> There are many levels that need to be designed.
> Erik (and others) have done an excellent job in the
> notes/wc-ng-design file of analyzing what I think of as the "middle"
> layer: the layer most similar to the current libsvn_wc. I've been
> doing a lot of thinking about the lower and higher layers.
>
> For the lowest layer, I've started implementing a prototype in Python
> of a low-level Subversion metadata store, with full unit tests and all
> that jazz. Currently I've designed the API for a refcounted blobstore
> and am working on tree and treestore abstractions. I'll put it
> somewhere (http://svn.collab.net/repos/svn/experimental/svnws? gvn
> repository? whatever) sometime soon, once I've got a little more
> implemented. The key goal here is to create a *non-brittle* working
> copy, where we don't have to be scared that typing "svn switch" with
> the wrong URL will corrupt the working copy irretrievably. Efficiency
> is nice too. (Hopefully, while the code itself will probably need to
> be backported to C, the tests might end up being executable against
> the "real" code.)
>
> For the higher layer, I've been thinking about the design of
> "libsvn_workspace" and an "svn workspace" command, which allows users
> to define non-trivial working copy layouts, mapping one or more
> repository subtrees into their workspace. "switched directories" and
> "externals" wouldn't be special cases any more: they'd just be normal
> things that show up in workspaces that aren't just a single repository
> subtree. Projects can be configured ad hoc using "svn workspace"
> commands, or in a version-controlled file (or property?), similar to
> svn:externals or the SVK svk:project:* property. Perhaps I (we?) can
> add this layer to the prototype I'm starting, before backporting it to
> C for the real svn_ws.
>
> So yeah, this is some of what I've been thinking.
>
> --dave
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
For additional commands, e-mail: dev-help_at_subversion.tigris.org

Received on 2008-04-23 12:22:03 CEST
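
Editorial sketch of the "refcounted blobstore" idea from the quoted message. The prototype's real API was not posted in this thread, so everything below is an assumption made for illustration and not the actual design: the class name RefcountedBlobStore, the methods add/get/release/refcount, the SHA-1 keys, and the in-memory dicts are all hypothetical. It only shows the general technique: identical content is stored once, and a blob is deleted when its last reference is released.

```python
import hashlib


class RefcountedBlobStore:
    """In-memory, content-addressed blob store with reference counts.

    Hypothetical API for illustration; not taken from the svnws prototype.
    """

    def __init__(self):
        self._blobs = {}      # key (hex digest) -> blob contents
        self._refcounts = {}  # key (hex digest) -> number of live references

    def add(self, data: bytes) -> str:
        """Store data (or reuse an identical blob) and take one reference."""
        key = hashlib.sha1(data).hexdigest()
        if key not in self._blobs:
            self._blobs[key] = data
            self._refcounts[key] = 0
        self._refcounts[key] += 1
        return key

    def get(self, key: str) -> bytes:
        """Return the blob contents; raises KeyError if the blob is gone."""
        return self._blobs[key]

    def release(self, key: str) -> None:
        """Drop one reference; delete the blob when none remain."""
        remaining = self._refcounts[key] - 1
        if remaining == 0:
            del self._blobs[key]
            del self._refcounts[key]
        else:
            self._refcounts[key] = remaining

    def refcount(self, key: str) -> int:
        """Number of live references, or 0 for an unknown key."""
        return self._refcounts.get(key, 0)
```

In a working copy, two files with the same pristine text would share one blob with a refcount of 2. Switching one of them to a different URL would release a single reference and leave the other file's base text intact, which is the kind of non-brittleness the message asks for.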
This is an archived mail posted to the Subversion Dev mailing list.