Re: revnum considered harmful

From: Tom Lord <lord_at_regexps.com>
Date: 2002-12-16 13:43:40 CET

       Isn't all this just a special case of a larger issue - namely
       that with two transactions running that *may* affect each
       other, one of them *has* to wait for the other to complete (or
       for the system to determine that they do not overlap)?

It's hard to answer that query concisely.

Yes, this is an instance of a well known, general problem from
database theory: Avoid global txn sequence numbers or equivalent
limits on txn concurrency. It's a fundamental design blunder every
time.

No, this isn't an insurmountable problem -- not by any means -- and I
proposed the general form of a solution in the message you're replying
to ("four virtues"), and many details of a specific solution in my
FSDB message a day or so ago.

Consider not svn specifically, but an FSDB in general. Two write
transactions modify different parts of the tree. Do we have to know
in what order those two transactions occur? Not in general, no.
A concurrent read that spans the region of both writes can force us to
choose a particular order, but there's kind of a quantum mechanics
effect: if nobody's looking (in ways that matter), then the two
transactions don't have to be ordered.
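To make the "nobody's looking" point concrete, here is a minimal sketch in Python. It is not svn code and not my FSDB proposal; the names (`paths_overlap`, `must_order`) and the rule that two paths conflict when one is an ancestor of the other are assumptions chosen purely for illustration. Two write sets need a fixed order only if they touch overlapping paths, or if some concurrent reader's read set spans both.

```python
from typing import Sequence


def _parts(path: str) -> tuple[str, ...]:
    """Split '/trunk/doc' into ('trunk', 'doc'); '/' becomes ()."""
    return tuple(p for p in path.split("/") if p)


def paths_overlap(a: str, b: str) -> bool:
    """True if the paths are equal or one is an ancestor of the other."""
    pa, pb = _parts(a), _parts(b)
    n = min(len(pa), len(pb))
    return pa[:n] == pb[:n]


def sets_overlap(xs: Sequence[str], ys: Sequence[str]) -> bool:
    """True if any path in xs overlaps any path in ys."""
    return any(paths_overlap(x, y) for x in xs for y in ys)


def must_order(
    writes_a: Sequence[str],
    writes_b: Sequence[str],
    concurrent_reads: Sequence[Sequence[str]] = (),
) -> bool:
    """Decide whether two write txns need a definite relative order.

    Hypothetical rule: order is forced if the write sets overlap, or if
    some concurrent reader looks at regions touched by both writers.
    """
    if sets_overlap(writes_a, writes_b):
        return True
    return any(
        sets_overlap(r, writes_a) and sets_overlap(r, writes_b)
        for r in concurrent_reads
    )
```

With this rule, writes to /trunk/doc and /branches/x can be left unordered until a reader of / shows up; a txn that always writes / would overlap with everything.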

The ambiguity about txn order is important: it enables lots of
important optimizations. That's true not only for svn, but for
databases generally.

You suggested:

          (1) Prioritize the smaller transactions, letting the larger
          transactions require a re-try (or simply fail) in case of a
          conflict.

          (2) Prioritize the larger transactions, letting the smaller
          transactions require a re-try (or simply fail) in case of a
          conflict.

          (3) Finish transactions on a first-come first-served basis.

but left out (4): Use fine-grained locking and/or heuristics based on
partial information about the txns-so-far to decide in what order to
prioritize the two. This is quite plausible in the case at hand. You
have to get past the notion of writes propagating all the way up to
new revisions of the / directory, though -- that might be hard if you
are stuck in the current svn mindset. Cheap tree cloning -- good.
Every txn versions / -- bad.

I may as well point to arch, in which locks are
per-line-of-development. Although arch is not an FSDB, the idea of
per-line locking does map to an FSDB in a natural way.
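A rough sketch of how per-line locking might map onto an FSDB, again in Python. This is not how arch is implemented; the class `LineLocks`, its `commit` method, and the per-line revision counter (standing in for "no global txn sequence number") are all hypothetical names and assumptions of mine. The point is only that commits on different lines never contend for the same lock.

```python
import threading
from typing import Callable


class LineLocks:
    """One lock and one revision counter per line of development (assumed design)."""

    def __init__(self) -> None:
        self._guard = threading.Lock()            # protects the two tables below
        self._locks: dict[str, threading.Lock] = {}
        self.revisions: dict[str, int] = {}       # per-line counters, no global revnum

    def _lock_for(self, line: str) -> threading.Lock:
        # Create the per-line lock and counter on first use.
        with self._guard:
            self.revisions.setdefault(line, 0)
            return self._locks.setdefault(line, threading.Lock())

    def commit(self, line: str, apply_change: Callable[[], None]) -> int:
        """Apply a change while holding only that line's lock.

        Returns the new revision number of the line.
        """
        with self._lock_for(line):
            apply_change()
            self.revisions[line] += 1
            return self.revisions[line]
```

Two commits to "trunk" serialize against each other; a commit to "branches/x" does not wait for either, and nothing in the sketch assigns the three a global order.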

And, just for fun (cause it's a fun read): There's a neat (~30 year
old) paper by Leslie Lamport relating concepts from special relativity
to synchronizing concurrent threads. It's not directly about
databases, but the concepts and approach developed there are helpful
here. Sorry, though, I don't have the precise reference handy (I
think it was in "Communications of the ACM" -- I seem to remember an
orange cover.)

         My apologies if I've totally missed the point / am talking
         out of my ass.

Hardly. This stuff is hella tricky, IMO. It took, like, 10 or so
revisions of my message before it was close-enough-to-accurate to
consider sending. Even so, I'm sure a practiced nitpicker could find
enough little flubs to totally distract attention away from the deep
content. Such is life.

-t

Received on Mon Dec 16 13:32:08 2002

This is an archived mail posted to the Subversion Dev mailing list.