Re: Why rewrite the Subversion working copy?

From: Marc Strapetz <marc.strapetz_at_syntevo.com>
Date: Wed, 23 Apr 2008 12:21:33 +0200

> My opinion? The Subversion project shouldn't spend any more time
> trying to make Subversion a better version control tool for non-huge
> open source projects. Subversion is already decent for that task, and
> other tools have greater potential than it. We need to focus on
> making Subversion the best tool for organizations whose users need to
> interact with repositories in complex ways, like:

To me that sounds too pessimistic -- would the Subversion project
itself meet these conditions?

Subversion should focus on being the best tool for all "centralized"
environments: those with servers, clients and a reasonable amount of
bandwidth. In such environments it should clearly be better than a
decentralized VCS, even for non-huge repositories and for the most
basic repository interactions.

So, back to the working copy: from my personal experience, improving
working copy performance could be important here (the wc-ng-design
suggestions for an alternative storage to the current .svn directories
sound especially promising).

--
Best regards,
Marc Strapetz
_____________
SyntEvo GmbH
www.syntevo.com

David Glasser wrote:
> Why rewrite the Subversion working copy?
> 
> It's generally agreed by now that libsvn_wc is the most painful part
> of Subversion to deal with.  Reasons for this, and the start of a
> design for a rewrite, are enumerated by Erik Huelsmann and others in
> notes/wc-ng-design.
> 
> But this is going to be a lot of work!  Is it really worth it?
> 
> It's been over four years since Subversion 1.0, and over seven years
> since the first milestone release.  Subversion showed the world that
> there could be open source version control software that was better
> than CVS; since then, there has been an explosion of new open source
> version control systems.
> 
> So when it comes time to consider sinking a large amount of time and
> effort into rewriting the hairiest part of Subversion, a natural
> question is: why bother?  Why put the effort into improving Subversion
> instead of working on git, Mercurial, or <insert your favorite new VCS
> here>?
> 
> Now, Subversion has many obvious advantages over, say, git:
> 
> * more user-friendly UI (except all the weirdnesses that come from the
>   .svn-in-each-directory decision)
> * commitment to portability (specifically, including Windows)
> * library design with a stable and documented API for integration into
>   other systems
> 
> But, you know, fixing these problems in git might not be much harder
> than rewriting the Subversion working copy.  And by now, I'm pretty
> confident that, *in situations where it's feasible*, the "having all
> of the historical data on the client" model is superior to the
> Subversion client/server model.  We spend so much time in Subversion
> development thinking about "roundtrips" and "putting load on the
> server" and so on, which isn't even a consideration for a system like
> git.
> 
> But the key phrase there is "in situations where it's feasible".
> 
> I'm pretty confident that, for a new open source project of non-huge
> size, I would not choose Subversion to host it, at least not for
> reasons directly associated with the version control system itself
> (e.g., I might choose Subversion because I like Google Code Project
> Hosting; github looks like it might be good competition eventually,
> though).
> 
> So does that mean Subversion is dead?  That we should all jump ship
> and just write a new front-end for git and make sure it runs on
> Windows?
> 
> Nah.  Centralized version control is still good for some things:
> 
> * Working on huge projects where putting all of the *current* source
>   code on everyone's machine is infeasible, let alone complete
>   history (but where atomic commits across arbitrary pieces of the
>   project are required).
> * Read authorization!  A client/server model is pretty key if you
>   just plain aren't allowed to give everyone all the data.  (Sure,
>   there are theoretical ways to do read authorization in distributed
>   systems, but they aren't that easy.)
> 
> My opinion?  The Subversion project shouldn't spend any more time
> trying to make Subversion a better version control tool for non-huge
> open source projects.  Subversion is already decent for that task, and
> other tools have greater potential than it.  We need to focus on
> making Subversion the best tool for organizations whose users need to
> interact with repositories in complex ways, like:
> 
> * Working on enormous repositories, where you don't want to check out
>   the entire project
> 
>   - checkouts below the branch root: we have that!
>   - sparse directories: we mostly have that!
> 
> * Working on repositories with enormous files, where you don't want an
>   extra "base" copy of every file if you're only editing a few at a
>   time
> 
>   - baseless wcs: we don't have that yet, but it should be easy in
>     wc-ng
> 
> * Workspaces where different parts come from different branches
> 
>   - switching: we have that, but it's really easy to pass the wrong
>     URL to "svn switch" and break your working copy
> 
> * Workspaces containing trees from different repositories, or from
>   different parts of the same repository nested inside each other
> 
>   - externals: we have it, but it's a tacked-on wart of a feature with
>     shoddy semantics, a shoddy UI, and big limitations
> 
> * Workspaces containing multiple parts of one repository, side by
>   side, which should be committed together atomically
> 
>   - this sort of works, sometimes, if you happen to hit one of the
>     codepaths that doesn't try to find a common parent of the commit
>     targets, or you do awful hacks like putting a fake ".svn"
>     directory in the parent directory
> 
> A great deal of the power of tools like git comes from their ability
> to assume that situations like the above aren't worth dealing with.
> (And for lots of projects, they're absolutely correct!  We don't need
> version control systems to be one-size-fits-all.)
> 
> The general Subversion architecture *should* be able to deal with all
> of the above use cases.  Some of them are already achieved, to some
> degree or another.  None of them should require any server-side
> changes at all.  They all could theoretically be achieved without a
> full wc rewrite (see Blair's message today about "file externals"),
> but dealing with the current wc is full of pain.
> 
> I'd like to put time and energy into fixing the working copy
> situation.  But I don't want to fix it just in order to make easy
> things still easy.  I want to fix it to make hard things feasible.
> 
> There are many levels that need to be designed.  Erik (and others)
> have done an excellent job in the notes/wc-ng-design file of analyzing
> what I think of as the "middle" layer: the layer most similar to the
> current libsvn_wc.  I've been doing a lot of thinking about the lower
> and higher layers.
> 
> For the lowest layer, I've started implementing a prototype in Python
> of a low-level Subversion metadata store, with full unit tests and all
> that jazz.  Currently I've designed the API for a refcounted blobstore
> and am working on tree and treestore abstractions.  I'll put it
> somewhere (http://svn.collab.net/repos/svn/experimental/svnws?  gvn
> repository?  whatever) sometime soon, once I've got a little more
> implemented.  The key goal here is to create a *non-brittle* working
> copy, where we don't have to be scared that typing "svn switch" with
> the wrong URL will corrupt the working copy irretrievably.  Efficiency
> is nice too.  (Hopefully, while the code itself will probably need to
> be backported to C, the tests might end up being executable against
> the "real" code.)
> 
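
A minimal sketch of what a "refcounted blobstore" could look like, for
readers who want something concrete. This is not the prototype's actual
API, which has not been posted: the class name, the method names, the
use of SHA-1 as the key and the in-memory dictionaries are all
assumptions made for illustration.

```python
import hashlib


class RefCountedBlobStore:
    """Hypothetical in-memory, content-addressed store with refcounts."""

    def __init__(self):
        self._blobs = {}      # key -> stored bytes
        self._refcounts = {}  # key -> number of live references

    def add(self, data: bytes) -> str:
        """Store data (once) and take one reference; return its key."""
        key = hashlib.sha1(data).hexdigest()
        if key not in self._blobs:
            self._blobs[key] = data
            self._refcounts[key] = 0
        self._refcounts[key] += 1
        return key

    def get(self, key: str) -> bytes:
        """Return the bytes stored under key; KeyError if unknown."""
        return self._blobs[key]

    def release(self, key: str) -> None:
        """Drop one reference; delete the blob when none remain."""
        remaining = self._refcounts[key] - 1
        if remaining == 0:
            del self._blobs[key]
            del self._refcounts[key]
        else:
            self._refcounts[key] = remaining

    def refcount(self, key: str) -> int:
        """Return the current reference count (0 if not stored)."""
        return self._refcounts.get(key, 0)
```

In this sketch, identical content added twice is stored once and
counted twice, so a blob only disappears when the last tree that
refers to it lets go of it.
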
> For the higher layer, I've been thinking about the design of
> "libsvn_workspace" and an "svn workspace" command, which allows users
> to define non-trivial working copy layouts, mapping one or more
> repository subtrees into their workspace.  "switched directories" and
> "externals" wouldn't be special cases any more: they'd just be normal
> things that show up in workspaces that aren't just a single repository
> subtree.  Projects can be configured ad hoc using "svn workspace"
> commands, or in a version-controlled file (or property?), similar to
> svn:externals or the SVK svk:project:* property.  Perhaps I (we?) can
> add this layer to the prototype I'm starting, before backporting it to
> C for the real svn_ws.
> 
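
To illustrate the workspace idea, here is a rough sketch in which a
workspace is nothing more than a table of mount points, each mapping a
workspace path to a repository URL. "libsvn_workspace" does not exist
yet, so everything here (the Workspace class, mount(), resolve(), the
example URLs) is hypothetical. The point is only that a switched
directory or an external is just another entry in the table.

```python
def _split(path: str) -> tuple:
    """Split a workspace-relative path into its non-empty components."""
    return tuple(p for p in path.split("/") if p and p != ".")


class Workspace:
    """Hypothetical mapping of workspace paths to repository URLs."""

    def __init__(self):
        self._mounts = {}  # tuple of path components -> repository URL

    def mount(self, path: str, url: str) -> None:
        """Map the subtree at url into the workspace at path."""
        self._mounts[_split(path)] = url.rstrip("/")

    def resolve(self, path: str) -> str:
        """Return the repository URL backing a workspace path."""
        parts = _split(path)
        # The longest mounted prefix wins, so nested mounts override
        # whatever the enclosing mount would have provided.
        for n in range(len(parts), -1, -1):
            prefix = parts[:n]
            if prefix in self._mounts:
                return "/".join((self._mounts[prefix],) + parts[n:])
        raise KeyError(path)
```

For example, mounting a trunk at the workspace root and a tree from a
second repository at "lib/vendor" makes resolve("lib/vendor/a.h") point
into the second repository, while every other path still resolves
against the trunk.
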
> So yeah, this is some of what I've been thinking.
> 
> --dave
> 
> 
Received on 2008-04-23 12:22:03 CEST

This is an archived mail posted to the Subversion Dev mailing list.