> You've misunderstood the code (or ghudson's ra_svn protocol
> is broken, which I highly doubt).
I think the confusing bit is that the set-target-rev editor
function is used for updates and similar operations, not for
commits.
I was confused by reading and misinterpreting the `protocol' file in
the ra_svn directory and the description of the database schema in the
`fs' directory.
An admittedly quick read through the schema document made it seem that
pending transactions are recorded in the database and that that record
includes a transaction number -- which implies the txn number is
assigned early.
The confusion was reinforced by discussion on this list about certain
usage errors / bugs(?). Specifically, it seemed to me that early in
the transaction, a commit examines the revnum of the repository to
make sure that the wd is up-to-date wrt that revnum, and refuses to
proceed if it is not. That, too, implies that the client (effectively)
knows its new revnum early in the txn. (I suppose now, in retrospect,
that the commit is not looking at the global revnum, but only at the
last revnum at which files being committed previously changed.)
I think there are still two problems with revnum: (a) a (much
reduced) performance limitation; (b) a semantic problem from the
source mgt. perspective.
(a) the (much reduced) performance limitation:
While assigning revnum late is far better than assigning it early, the
existence of revnum _still_ limits server scalability (though in a
less serious way). In particular, if a single repository is
implemented over a distributed database, all of the participating
servers must still synchronize for every transaction in order to
allocate txn numbers -- you'll still have either a single thread of
execution or a distributed commit protocol through which all commits
must pass.
With no revnum, concurrent, non-overlapping txns can be unordered --
for example, using a distributed database, synchronization for a set of
such transactions can be coalesced (reducing the total number of
syncs) and can take place asynchronously wrt the txns themselves
(e.g., well after they have completed and clients have moved on).
Realistically (imo), _this_ performance problem can only ever really
be important for utterly huge transaction rates.
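
To make the "non-overlapping txns can be unordered" point in (a) concrete,
here is a minimal sketch. It assumes a transaction can be modelled as just
the set of paths it touches; the dict layout, the function names and the
sample paths are all made up for illustration and are not anything from
the svn fs code. Only the pairs it reports would need to be ordered
against each other; everything else could be synced lazily.

```python
from itertools import combinations


def overlaps(txn_a, txn_b):
    """Two transactions overlap if they touch at least one common path."""
    return not txn_a["paths"].isdisjoint(txn_b["paths"])


def needs_ordering(txns):
    """Return the pairs of transaction ids that must be ordered.

    Pairs that do not appear in the result touch disjoint paths, so
    (under the assumption above) no global counter is needed to
    sequence them relative to each other.
    """
    return [
        (a["id"], b["id"])
        for a, b in combinations(txns, 2)
        if overlaps(a, b)
    ]


# Hypothetical in-flight transactions.
txns = [
    {"id": "t1", "paths": {"/trunk/a.c"}},
    {"id": "t2", "paths": {"/trunk/b.c"}},
    {"id": "t3", "paths": {"/trunk/a.c", "/trunk/doc.txt"}},
]

print(needs_ordering(txns))
```

Here t2 overlaps with nothing, so it could commit without synchronizing
with t1 or t3. The sketch compares exact paths only; a real system would
also have to treat a directory and its children as overlapping.
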
(b) the source mgt problem:
Revnum is harmful for another reason that has nothing to do with
concurrency.
If I'm reading the FAQ correctly ( :-), revnum is, in essence, an
implementation detail -- it is "mostly hidden" from users for revision
control purposes.
Yet within one repository, merge history is expressed wrt. revnum.
The emerging plan for distributed revision control seems to be aiming
at recording merge history as <guid,revnum> pairs.
Thus, the plan for merge history keeps track of history in low level
terms that officially have no high-level rev ctl meaning.
To understand why that's problematic, it's helpful to consider that
merge history is not only the underlying support for "smart merging"
-- it's also a record of reference that humans want to be able to
read. It should be expressed in higher level terms.
This gets into smart changeset management. For example, in a single line
of development one would ideally like human-consumable names for each
revision, and (at least in the branches critical to a large
development effort), to regard each revision as a particular,
purposeful changeset. A query about the revisions for project `foo'
might generate a list like:
    foo-rev1    added feature xyzzy
    foo-rev2    added feature quux
    foo-rev3    fixed bug #1234
    ....
When two related lines are merged or partially merged, those changesets
are the ideal "unit of merging". One might ask "on my branch, what's
been merged in from the foo mainline?" and get:
    foo-rev1
    foo-rev3
or ask "what's missing from foo?" and get:
    foo-rev2
and then, the human reader knows: "The feature `quux' has not been
merged into foobranch". And the humans have friendly names for the
changesets in question.
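
As a rough sketch of those two queries: assume each line of development
keeps an ordered log of named changesets, and a branch records the names
of the mainline changesets it has already merged. The data structures and
function names below are hypothetical, and the sketch reports the
mainline's own changeset names (foo-rev1, ...), which is an assumption
about how such a tool would label them.

```python
# Ordered changeset log for the foo mainline (names from the example above).
mainline_log = [
    ("foo-rev1", "added feature xyzzy"),
    ("foo-rev2", "added feature quux"),
    ("foo-rev3", "fixed bug #1234"),
]

# Mainline changesets already merged into foobranch (assumed for the example).
branch_merged = {"foo-rev1", "foo-rev3"}


def merged_from(log, merged):
    """Changesets in `log` that the branch has merged, in log order."""
    return [(name, summary) for name, summary in log if name in merged]


def missing_from(log, merged):
    """Changesets in `log` that the branch has not merged, in log order."""
    return [(name, summary) for name, summary in log if name not in merged]


for name, summary in missing_from(mainline_log, branch_merged):
    print(name, summary)
```

Because the merge history is keyed by changeset name rather than by
<guid,revnum>, the answer is readable as-is, and the same two functions
would work no matter which system produced the log.
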
Moreover, by giving revisions more meaningful, less
repository-specific names like this, it becomes practical to
put the tar bundle:
    foo-rev2-patch.tar.gz
on your site, let people merge that with a `patch'-like tool, and have
the effect be the same as if they'd done an operation between
repositories.
It also becomes possible to have "smart merging" technology not be
specific to any particular rev ctl system -- but to instead have
systems be interoperable in this regard. I can have a branch in my
svn repository of a line in your arch repository and smart merge
between those.
So, I think that both the intra-repository and global revision names
for merging purposes should not be based on revnum, but on an
independent, higher-level namespace.
-t
Received on Mon Dec 16 22:17:59 2002