Re: RFC: Revision indexes for 1.1

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2004-04-25 20:11:30 CEST

On Sun, 2004-04-25 at 13:53, Branko Čibej wrote:
> So in the final analysis, yes, people won't run into the hard cases very
> often, and when they do, it'll be because they're trying to diff or
> merge unrelated things.

I don't agree. It seems like a reasonable question to ask, "What
changed in this repository between January and February of 2002?" and if
we've given people the rope to have screwed that up in November of 2003
by inserting mis-ordered revisions, we've done the user a disservice.

We have a responsibility to define a semantic model which is simple and
well-defined, not one that we think will just happen to work most of the
time.
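
To make the concern concrete, here is a minimal sketch, not taken from Subversion's code. It assumes date-to-revision lookup is done by binary search over revision dates, which is only valid if dates never decrease as revision numbers grow. The function names and the sample history are hypothetical.

```python
from bisect import bisect_right
from datetime import date

def rev_at_date(dates, when):
    """Youngest revision dated at or before `when`.

    `dates[i]` is the date of revision i. The binary search is only
    meaningful if `dates` is sorted, i.e. revisions are date-ordered.
    """
    return bisect_right(dates, when) - 1

def revs_dated_within(dates, start, end):
    """Every revision whose date falls in [start, end), by full scan."""
    return [rev for rev, d in enumerate(dates) if start <= d < end]

jan1, feb1 = date(2002, 1, 1), date(2002, 2, 1)

# Hypothetical history with dates in revision order (r0..r4).
ordered = [date(2001, 12, 1), date(2002, 1, 10), date(2002, 1, 20),
           date(2002, 2, 15), date(2003, 11, 1)]

# Same history after someone adds r5 in November 2003 carrying a
# January 2002 date (a mis-ordered revision).
misordered = ordered + [date(2002, 1, 15)]

# With ordered dates, both views agree: the January changes are r1 and r2.
ordered_range = (rev_at_date(ordered, jan1), rev_at_date(ordered, feb1))
ordered_scan = revs_dated_within(ordered, jan1, feb1)

# With r5 present, the binary search still answers r0..r2, while the
# scan also reports r5. The two answers to the same question now differ.
misordered_range = (rev_at_date(misordered, jan1), rev_at_date(misordered, feb1))
misordered_scan = revs_dated_within(misordered, jan1, feb1)
```

Neither answer is obviously the right one once r5 exists, which is the sense in which the semantic model stops being well-defined.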

> >>If we keep that restriction, there's no way to optimize cvs2svn, which
> >>means that people who start with a converted repository will keep
> >>complaining about the size blowup.

> I may have overstated that; it's probably not impossible, but very hard
> because, to create optimal branches and tags from CVS, you have to
> globally optimize the sequence of copies, which means you use up
> enormous amounts of either and/or memory.

(Either what and/or memory?)

I don't think we should be making deep semantic compromises in svn for
the sake of efficiency gains in cvs2svn, and I strongly suspect the
branch optimization problem isn't insurmountable.

> >>>I've gotten the impression that cursor walks create locking issues in
> >>>the BDB implementation.
> >>I can't believe BDB needs more than two lock objects to do a linear
> >>cursor walk, unless you do the walk in a transaction. And there's no
> >>need to do that, it being a read-only operation.
> >But there might be write operations mucking with the table at the same
> >time, and they need to do so in a transaction.
> So what? That just means that two identical date queries don't
> necessarily return the same range of revisions, but I don't see that as
> a problem.

Context, context. "So what" meaning "perhaps we'll have locking
problems." I don't really understand what leads to BDB locking
problems; I'm just relying on a statement from CMike that a cursor walk
of the revisions table during a read-only operation has created locking
issues in the past.

> >You've lost me, a bit. Were you proposing that the revision indices
> >would all be btree tables?

> I was proposing that there be one table for all indexed revision props.
> Of course it has to be a btree table, how else can you use it as an
> index and get any performance benefit? Well, really.

You could use a hash table; the only reason to use a btree table would
be for this date thing.
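
A rough illustration of the distinction, using plain Python structures as stand-ins for BDB hash and btree tables. This is an analogy under stated assumptions, not the proposed schema: the sample revprops, the author names, the shortened date strings, and the helper names are all hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical revision properties; real svn:date values are full
# timestamps, shortened here to ISO dates (which sort lexicographically).
revprops = {
    1: {"svn:author": "alice", "svn:date": "2002-01-10"},
    2: {"svn:author": "bob",   "svn:date": "2002-01-20"},
    3: {"svn:author": "alice", "svn:date": "2002-02-15"},
    4: {"svn:author": "bob",   "svn:date": "2003-11-01"},
}

# Hash-style index: (property name, value) -> revisions.
# Exact-match lookups only; key order carries no meaning.
hash_index = defaultdict(list)
for rev, props in sorted(revprops.items()):
    for name, value in props.items():
        hash_index[(name, value)].append(rev)

# Ordered index (the btree stand-in): entries kept sorted by date,
# so a range of dates maps to a contiguous slice.
date_index = sorted((props["svn:date"], rev) for rev, props in revprops.items())

def revs_by_author(author):
    """Equality query: a hash index is sufficient."""
    return hash_index.get(("svn:author", author), [])

def revs_in_date_range(start, end):
    """Range query over [start, end): needs the ordered index."""
    # A 1-tuple (d,) sorts before every (d, rev), so bisect_left
    # finds the first entry dated at or after d.
    lo = bisect_left(date_index, (start,))
    hi = bisect_left(date_index, (end,))
    return [rev for _, rev in date_index[lo:hi]]
```

Here `revs_by_author("alice")` never needs key ordering, while `revs_in_date_range("2002-01-01", "2002-02-01")` depends on it entirely.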

> >It's true, your revision index feature is difficult (though I think not
> >impossible) to implement within a libsvn_fs_fs design, and since I
> >continue to think that it's of minimal value in general, I'm not very
> >fond of it.

> I can't agree that it is of minimal value. The fact that you can't do
> efficient searches of revprops is a big limitation. Right now the only
> fast index is the revision number, and I see this as a usability
> misfeature because it makes CM tracking and reporting so much harder. It
> may not be a big deal for your average student project, but it's fairly
> major if you want to use SVN to implement any serious quality management
> process.

I feel like we have a fundamental conflict here. Subversion was
originally conceived of as a version control tool, and with its current
feature set we can have an implementation of it which is flexible and
low-overhead. If we want to turn it into Clearcase, we'll lose that
ability, because it will become too unwieldy to index the repository in
all the desired ways without an Oracle-caliber database. Moreover, our
learning curve will suffer as our command set grows to encompass a set
of features most people will never need.

Of course, we could implement an SQL back end and let people build
layered products on top of Subversion which take advantage of whatever
indexing an SQL database can provide. But that's different from
providing core Subversion features aimed at implementing a full-fledged
CM system.

Received on Sun Apr 25 20:11:51 2004