
Re: First impressions...

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2002-01-25 18:41:31 CET

I think I agree with you on most of your negatives, but none of them are
likely to change any time soon. Specific comments:

On Fri, 2002-01-25 at 11:33, Eric M. Hopper wrote:
> Using Berkeley DB extensively is a bad idea. Plain text files are much
> more readable. Keep separate revisions in separate files or separate
> directories. Take advantage of the filesystem as a database. Using
> Berkeley DB essentially creates an extra superfluous namespace. Try
> reading the reiserfs naming system document
> (http://www.namesys.com/whitepaper.html) to understand why this is a bad
> idea.

I agree, and I think we're starting right now to see some added
difficulty in debugging Subversion problems because we're using db.
(It's hard to know for sure without going back in time, doing it without
db, and seeing if these problems became easier to fix.)

I've thought about how to do a direct mapping to the filesystem, and I
came up with a fairly simple idea: if REPOS is the path to the
repository, then REPOS/1 is the first revision, REPOS/2 is the second
revision, and so on. Inside a revision, you can have a relative symlink
back to a previous revision for an unchanged file or directory. Files
can contain raw contents or diffs against some other path. When you're
doing a commit, you build up a new rev in REPOS/some-id-string, and when
you're ready to finalize the commit you merge it with any commits which
have been made in the meantime and then rename it to REPOS/3 or whatever
the next number is.
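
A minimal sketch of that layout in Python, under stated assumptions: the
function names and the "txn-" staging prefix are hypothetical, a revision
is a single flat directory, files hold raw contents only (no diffs, no
deletions), and the merge with intervening commits is left out entirely.
It relies on POSIX symlinks.

```python
import os
import uuid


def latest_revision(repos):
    """Return the highest numbered revision directory in repos, or 0 if none."""
    revs = [int(name) for name in os.listdir(repos) if name.isdigit()]
    return max(revs, default=0)


def commit(repos, changed):
    """Create the next revision; `changed` maps file name -> new raw contents.

    Hypothetical helper: no diffs, no properties, no subdirectories, and no
    merge with commits made in the meantime.
    """
    prev = latest_revision(repos)
    # Build the new revision under a temporary name (REPOS/some-id-string).
    staging = os.path.join(repos, "txn-" + uuid.uuid4().hex)
    os.mkdir(staging)
    # Unchanged entries become relative symlinks back to the previous revision.
    if prev:
        for name in os.listdir(os.path.join(repos, str(prev))):
            if name not in changed:
                os.symlink(os.path.join("..", str(prev), name),
                           os.path.join(staging, name))
    # Changed or new files are stored with their raw contents.
    for name, data in changed.items():
        with open(os.path.join(staging, name), "w") as f:
            f.write(data)
    # Finalize: rename the staging directory to the next revision number.
    new_rev = prev + 1
    os.rename(staging, os.path.join(repos, str(new_rev)))
    return new_rev
```

Because the symlinks are relative ("../1/name"), they still resolve after
the staging directory is renamed to its final revision number, and a
reader of REPOS/N never has to know which entries were actually changed.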

Needs some fleshing out (I haven't mentioned properties, and you need
some concept of relatedness in order to produce efficient diffs for the
client), but... I wish I'd had this idea way back before the team decided
to use Berkeley DB and implemented it that way. :)

> Having a single tree version number is bad too. It becomes harder to
> get an intuitive sense of changes to individual files. It's also hard
> to create merged trees this way. Especially if tags just refer to a
> particular tree version number.

I'm undecided on the single tree version. I think it puts us in a bad
situation when it comes to distributed repositories or repository data
which comes from other sources. On the other hand, it has some very
nice properties; commit atomicity comes for free, for instance.

Anyway, I'm not sure you've learned yet how Subversion does tags and
branches. Subversion just provides cheap directory copies; it doesn't
have a namespace for tags and branches separate from the regular
repository namespace.

> I think having structureless version numbers may be a bad idea. It's
> useful to know, at a glance, which versions belong to which branch, and
> to be able to see the branch history of a file in its version history.
> Taking information out of the version number just removes it from clear
> view and makes you have to look somewhere else for the same info. Just
> because CVS had structured version numbers as a side effect of its
> kludgy implementation in terms of RCS doesn't mean they're evil.

I think the Subversion method of doing branches and tags is... bold, but
worth experimenting with. It actually flows out of your arguments about
having a single namespace.

> Controlling subversion behavior based on MIME types may be a bad idea.

I'm a little nervous about that myself, but I think it's one of the
easiest things to change, so I'm not too nervous.

> WebDAV? Why? Seems like adding a useless layer to me.

I don't like it either. The application protocol world has been pulled
in two directions recently: one contingent wants to design the simplest
possible protocols from a bird's-eye view, and the other contingent wants
to reuse protocol components--even if they are bulky, flawed, and full
of unneeded features--in order to reduce development time, get some
forms of compatibility between protocols, and in some cases to get
through firewalls. (Although, never bring up that last argument in an
IETF context; the IESG believes, probably correctly, that it's unethical
to deliberately subvert firewall policies by running a new protocol on
the same port as an existing protocol.)

I'm in the first camp, but it's hard to counter the arguments of the
second camp, and in my experience they generally win any time a system
is designed by a heterogeneous committee.

Received on Sat Oct 21 14:36:59 2006

This is an archived mail posted to the Subversion Dev mailing list.