[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

revnum considered harmful

From: Tom Lord <lord_at_regexps.com>
Date: 2002-12-16 10:51:45 CET

I think I see a flaw in the semantic design of svn (revnums) that I
believe is likely to impose a serious limit on performance in the
future, when people try to scale svn for large but realistic
situations:

revnum imposes a total order on all write transactions.

If I'm reading the code correctly, the global revision number of a
write transaction is determined early in the transaction -- before
most of the work is done (I think the ra_svn protocol calls this
step `target_rvn').

Let's suppose that we have two concurrent write transactions. One of
these precedes (by revnum) the other and which of the two that is is
determined before the scope of either transaction is known.

Let's suppose that the first of two concurrent transactions is a
large transaction, the second a short transaction.

The short transaction can not reasonably complete until the long
transaction completes. Consider:

        client A long txn revision N
        client B short txn revision N+1

        Timeline:

        0... A starts ..... B starts ..... B ready to finish ..... ->

        If B completes with A still running, client B expects to be
        able to query the repository at revision N+1. But the part of
        the repository written by A is not yet known because it is not
        yet known whether A will succeed or fail. While other clients
        might be limited to seeing only N-1 until the fate of A is
        known, client B expects to see the result of its own write.
        (This invites an alternative conclusion that while B can
        complete early, certain subsequent reads concurrent with A
        must be delayed -- but that alternative doesn't substantially
        change the conclusions reached in this message.)
        

        If B completes, aborting A, then time already spent on A
        is needlessly wasted.

In general, the time of completion of each of a set of concurrent
writes is forced, forever, by the revnum design, to be the time of
completion of the latest-finishing transaction in that set.

Consider a shop with O(100) developers and a busy phase of
development: lots of checkins over just a few days. Some checkins
modify just a few files, others modify hundreds or thousands.

Because of revnum, whenever someone begins to checkin a large
modification, the repository "freezes up" for all other committers
until that large checkin completes.

This is by no means a _necessary_ state of affairs. Consider this
usage scenario: we have a repository that holds multiple projects and
branches. Our two transactions each modify a different project or
branch. Thus, in terms of the revision control data that users care
about, the ordering of these two transactions is immaterial because
they modify disjoint sets of data. If we must regard them as
well-ordered, then we may as well let the order be determined by which
transaction _completes_ first, not which one starts first.

It would seem natural (at least under some conditions which can be
easily detected by the server) to let the short transaction complete
before the longer -- on the reasonable presumption that the longer
transaction will not overlap that data. But no -- the revnum semantic
will not allow that because the server does not know in advance that
the two transactions do not overlap.

Several solutions are possible.

One is to modify the protocols so that the scope of each transaction
is declared very early. This would reduce the impact of the problem,
but not eliminate it, as even declaring the scope of a large
transaction may turn out to be expensive.

Another is to modify the protocols so that the revnum of a write is
determined only when the transaction completes. This would also
reduce the problem, although it still will impose a bottleneck that
restricts the potential benefits of dividing a single logical
repository across multiple servers (consider a large code shop, or an
"exotic" application such as a large wiki).

Both of those imperfect solutions are likely to have a large impact on
the protocols, the server, and clients. If large impacts are
unavoidable anyway, one might as well consider a third, clean
solution: eliminate revnum entirely.

A strong case can be made that the primary four virtues of svn
repository semantics and presumed performance are:

        1. Space and time efficient tree cloning.

        2. Update methods that facilitate server-side delta-compressed
           storage.

        3. Change-based update and access methods that reduce network
           traffic.

        4. Transactions permitting multi-file atomic updates.

and, furthermore, a strong case can be made that those virtuous
capabilities are, in and of themselves, quite sufficient to implement
revision control (with either or both arch-like or cvs-like user
interfaces). Given those four capabilities, a global revnum is not
needed at all (and an alternative has already been presented).

Currently, it is primarily code that is part of svn itself that would
be impacted by elimination of revnum. QOTD slashdot (paraphrasing):
"If you can't find the time to do it right in the first place, how are
you going to find the time to fix it later?" :-)

I tend to believe that a great deal of care and effort has gone into
the low-level implementation of repositories (the fs layer?). That
work and expertese can, I think, be quickly leveraged to implement the
"four virtues" approach. Similarly, a great deal of care and effort
has gone into a UI layer with a CVS feel -- that too can be quickly
leveraged.

It would be a mistake for you to believe I care what color you paint
your bikeshed,
-t

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Mon Dec 16 10:40:11 2002

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.