TimeSafe Subversion

From: Branko Čibej <brane_at_xbc.nu>
Date: 2005-07-25 17:57:10 CEST

C. Michael Pilato wrote:

>Greg Hudson <ghudson@MIT.EDU> writes:
>
>
>
>>(Rev-prop changes are not intrinsically logged right now, and must be
>>logged using hook scripts. But they should have history! Here's an
>>outline proposal of how to do it:
>>
>> Instead of modelling rev-props as a name -> value mapping, model
>> them as a name -> list of (value, author, date) tuples. Figure
>> out how to store the new model in a manner compatible with the
>> old model. Provide access functions which access the entire
>> list for a rev-prop; the existing access function will of course
>> just return the value of the most recent tuple.
>>
>>With that in place, it becomes essentially unimportant to log write
>>operations, although once you have an access log it's natural to throw
>>them in.)
>>
>>
>
>I've been thinking about something similar for some time now. Here's
>a straw-man BDB implementation proposal:
>
> Today, a transaction property list is stored like so:
>
> (NAME VALUE ...)
>
> Note that both NAME and VALUE are skel atoms.
>
> We could tweak the code to recognize a new definition as well:
>
> (NAME () ...)
>
> We could tweak the code to recognize that if, in fact, a VALUE is a
> list instead of an atom, something special
>
>
> Today, an empty transaction property list is stored as just that,
> an empty list skel. We could tweak the code to recognize that if
> instead of a list there is a 0-length atom, this special value
> means, "go read the txnprops table." That txnprops table would be
> a BDB dupkeys implementation, where the rows are in one of the
> following formats:
>
> TXN_ID "/" NAME -> ("propset" VALUE AUTHOR DATE)
> TXN_ID "/" NAME -> ("propdel" AUTHOR DATE)
>
> Adding, changing, or deleting a property is as simple as tossing a
> new row into the table.
>
> Finding a property by name is a read of the last row in the set of
> rows matched by TXN_ID "/" NAME.
>
> Finding all properties for a transaction is a partial key lookup of
> TXN_ID "/", keeping only the last value per property name. That
> might not scale so well, but workarounds exist (add a new
> PROPNAMES list to the end of any transaction skels whose PROPS
> value is this new special 0-length atom value; get set of props by
> doing per-name lookups while iterating over property names.)
>
>
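For reference, the append-only table described above could be sketched roughly as follows. This is only a toy illustration in Python, not BDB code: the `TxnProps` class and its method names are made up for the example, and a dict of lists stands in for a table with duplicate keys.

```python
from collections import defaultdict


class TxnProps:
    """Toy stand-in for the proposed txnprops table (hypothetical names).

    A dict of lists plays the role of a BDB table with duplicate keys:
    each key is TXN_ID "/" NAME, and rows are only ever appended.
    """

    def __init__(self):
        self._rows = defaultdict(list)

    def propset(self, txn_id, name, value, author, date):
        self._rows[f"{txn_id}/{name}"].append(("propset", value, author, date))

    def propdel(self, txn_id, name, author, date):
        self._rows[f"{txn_id}/{name}"].append(("propdel", author, date))

    def get(self, txn_id, name):
        """Current value: the last row wins; None if unset or deleted."""
        rows = self._rows.get(f"{txn_id}/{name}")
        if not rows or rows[-1][0] == "propdel":
            return None
        return rows[-1][1]

    def history(self, txn_id, name):
        """Every change ever made to the property, oldest first."""
        return list(self._rows.get(f"{txn_id}/{name}", []))

    def proplist(self, txn_id):
        """All live properties of a transaction (the partial key lookup)."""
        prefix = f"{txn_id}/"
        props = {}
        for key in self._rows:
            if key.startswith(prefix):
                name = key[len(prefix):]
                value = self.get(txn_id, name)
                if value is not None:
                    props[name] = value
        return props
```

Nothing is ever overwritten here, so `history()` gives the full change log for a property for free, while `get()` keeps the existing "current value" behaviour.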
Heh heh, funny you both should mention that...

I have a more general solution in mind to patch the TimeSafe hole we
currently have. It too involves expanding the meaning of a transaction.
Currently, our filesystem looks like this

I propose to change this to something that looks like this:

  +-----------+            +-----------+
  | revision1 |            | revision2 |
  +-----------+            +-----------+
    | ^                      | ^
    v |                      v |
  +------+                 +------+
  | txnX |---------------->| txnY |---->...
  +------+                 +------+
      |      +-------+         |      +-------+
      +----->| treeX |         +----->| treeY |
      |      +-------+         |      +-------+
      |      +-----------+     |      +-----------+
      +----->| txn props |     +----->| txn props |
             +-----------+            +-----------+

Instead of being a list of immutable revisions, the filesystem would
become a list of immutable transactions, with revisions (revision
numbers) becoming nothing but a sparse index into this list.

Now, when you wanted to change the revprops of revision1, you'd create a
new transaction, txnZ, which would point to treeX, revision1, and its
own set of txn props; then you'd change the revision1 index entry to
point to txnZ instead of txnX. In short, a revprop change becomes a
transaction just like a tree change would. (O.K., the details of this
cross-linking are still a bit fuzzy.)
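To make the idea a bit more concrete, here is a toy sketch in Python of a filesystem as an append-only list of immutable transactions with a sparse revision index. It is an illustration under my own assumptions, not a schema proposal: the `Txn` and `Filesystem` names, their fields, and the way revision numbers are allocated are all invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Txn:
    """An immutable transaction (field names are illustrative only)."""
    tree: str                 # identifier of the tree this txn points to
    props: tuple              # txn props as sorted (name, value) pairs
    revision: Optional[int]   # revision number this txn belongs to
    date: float               # commit time of the transaction


class Filesystem:
    def __init__(self):
        self.txns = []        # append-only list of committed transactions
        self.rev_index = {}   # sparse index: revision number -> position in txns

    def _append(self, txn):
        self.txns.append(txn)
        return len(self.txns) - 1

    def commit_tree(self, tree, props, date):
        """An ordinary commit: new tree, new revision number."""
        rev = len(self.rev_index) + 1
        txn = Txn(tree, tuple(sorted(props.items())), rev, date)
        self.rev_index[rev] = self._append(txn)
        return rev

    def change_revprop(self, rev, name, value, date):
        """A revprop change: new txn, same tree, same revision number."""
        old = self.txns[self.rev_index[rev]]
        props = dict(old.props)
        props[name] = value
        new = Txn(old.tree, tuple(sorted(props.items())), rev, date)
        self.rev_index[rev] = self._append(new)   # repoint the index entry

    def revprops(self, rev):
        """Current props of a revision, read through the index."""
        return dict(self.txns[self.rev_index[rev]].props)
```

In this sketch the old transaction is never touched; only the index entry for the revision moves, so the previous revprops remain reachable in the transaction list.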

The point is that when you ask the question, "what did the repository
look like at time so-and-so", you look for the transaction that was
committed at that time, and that will link back to the revision number.
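A rough Python sketch of that lookup, assuming the transaction list is kept ordered by commit time. The tuple layout, the sample data, and the function names are invented for illustration only.

```python
from bisect import bisect_right

# Append-only transaction log, ordered by commit time.
# Each entry: (commit_time, revision, log_message). Made-up sample data.
TXNS = [
    (100, 1, "typo in log"),
    (200, 2, "second commit"),
    (300, 1, "fixed log"),     # revprop change on revision 1
]


def txn_at(txns, when):
    """Return the last transaction committed at or before `when`, or None."""
    times = [t[0] for t in txns]
    i = bisect_right(times, when)
    return txns[i - 1] if i else None


def revisions_at(txns, when):
    """Map each revision visible at `when` to the txn it pointed to then."""
    times = [t[0] for t in txns]
    view = {}
    for txn in txns[:bisect_right(times, when)]:
        view[txn[1]] = txn      # later txns for a revision replace earlier ones
    return view
```

With the sample data, asking for time 250 finds the transaction for revision 2, and revision 1 still shows its original log message, because the revprop change was only committed at time 300.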

I'm assuming for now that implementation constraints would forbid
changes to the tree associated with a particular revision, but that rule
could possibly be relaxed for certain kinds of node properties; that's
sheer speculation at this point.

I think this solution would be much cleaner than the proposed partial
hacks at the current schema. It is of course a 2.0 solution, but any
change in this direction would result in a number of protocol and client
API/UI changes in order to make it useful. I'm sure those could be
implemented in a backwards-compatible way, but I wonder if it makes
sense to do so.

-- Brane

Received on Mon Jul 25 18:05:46 2005
