On Sat, Dec 27, 2008 at 12:26, Hyrum K. Wright wrote:
> Greg Stein wrote:
>> I don't understand why transactions would be exposed to users of this
>> API. Each API "should" leave the db in a consistent state. If each API
>> does *not*, then you are placing a burden on the caller to make
>> specific sets of calls in a specific order, which is usually hard to
>> document, so you end up with unwritten assumptions.
> Agreed, but I'm not yet sure of the level of modularity and consistency we want
> or need in the working copy. SQLite performs much better with the use of
> transactions, and using large transactions is a potential performance win for
> people running on remote filesystems (think NFS).

Agreed. Though I would phrase it as "deferred commit, after N
operations". When we need a *transaction*, then we'll certainly be
doing that. You're referring more to disabling autocommit, and then
bulk-committing a bunch of work. Correct?

> There is a balance to be
> struck between one giant do-everything API and many individual
> do-little-things-but-keep-the-wc-consistent APIs, and I'm not yet sure
> where that should be.
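To make the "disabling autocommit, then bulk-committing" idea concrete, here is a minimal sketch using Python's stdlib sqlite3 bindings. The schema and names are illustrative only, not the real wc.db:

```python
import sqlite3

def bulk_record(conn, paths):
    """Record many working-copy paths in ONE transaction.

    Without the explicit transaction, each INSERT would commit (and
    fsync) on its own -- the per-operation cost the thread is talking
    about.  The `node` table here is a toy stand-in for wc.db.
    """
    with conn:  # BEGIN ... COMMIT (ROLLBACK if an exception escapes)
        conn.executemany("INSERT INTO node (local_relpath) VALUES (?)",
                         ((p,) for p in paths))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (local_relpath TEXT PRIMARY KEY)")
bulk_record(conn, ["A", "A/mu", "A/B"])
count = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]
```

All three rows land, or none do; either way the database is left consistent.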
Agreed. My hope was that each API function would leave the WC
consistent. For example, let's say that we're doing a checkout, and an
"add file" arrives, so we call that function, and it marks that props
and contents have not (yet) arrived. The setprops API is called, and
the one flag is cleared. The contents arrive, and the other flag is
cleared. At each point, we have the data recorded, so the db is
consistent, even if incomplete at a semantic level ("oops: this file
has been defined, but its props are missing" can be deterministically
detected). (Note that we'll need something like this, since these three
pieces of info can arrive at *very* different times during the checkout.)
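A sketch of that flag-clearing scheme, with a hypothetical schema (column and function names are mine, not svn's): each API call leaves the db consistent, and an incomplete node is detectably incomplete.

```python
import sqlite3

# Toy wc.db: a file row carries flags saying whether its properties
# and text have arrived yet, so the db is always in a known state.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE node (
                    path TEXT PRIMARY KEY,
                    props_missing INTEGER NOT NULL DEFAULT 1,
                    text_missing  INTEGER NOT NULL DEFAULT 1)""")

def add_file(path):
    """'add file' arrived: record it; props and contents still pending."""
    conn.execute("INSERT INTO node (path) VALUES (?)", (path,))

def set_props(path):
    """Props arrived: clear one flag."""
    conn.execute("UPDATE node SET props_missing = 0 WHERE path = ?", (path,))

def set_text(path):
    """Contents arrived: clear the other flag."""
    conn.execute("UPDATE node SET text_missing = 0 WHERE path = ?", (path,))

add_file("A/mu")
set_props("A/mu")
still_incomplete = conn.execute(
    "SELECT path FROM node WHERE props_missing OR text_missing").fetchall()
set_text("A/mu")
now_incomplete = conn.execute(
    "SELECT path FROM node WHERE props_missing OR text_missing").fetchall()
```

Until its text arrives, A/mu shows up in the "semantically incomplete" query; afterwards the query is empty.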
But as you say: that is a lot of disk writes, so (internally) we can
have something that bundles up (say) 100 operations before flushing to
disk. That can happen internally, rather than at the direction of the
caller.
That said, allowing the client to provide hints would be alright with
me. "I think this directory is done. Flush it now, so that a
complete chunk is recorded if an error stops some later
processing." For example, we don't want a whole checkout to be one
mother all-or-nothing transaction.
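The "bundle up N operations internally, but accept flush hints" idea could look something like this hypothetical wrapper (again not the real wc_db API):

```python
import sqlite3

class BatchedDb:
    """Buffer writes and commit every `batch` operations.

    Callers may also call flush() as a hint ("this directory is done"),
    so a complete chunk is on disk if a later error stops processing.
    """
    def __init__(self, conn, batch=100):
        self.conn, self.batch, self.pending = conn, batch, 0
        conn.isolation_level = None        # manage transactions by hand
        conn.execute("BEGIN")
    def write(self, sql, args=()):
        self.conn.execute(sql, args)
        self.pending += 1
        if self.pending >= self.batch:
            self.flush()                   # internal, automatic flush
    def flush(self):
        """Commit buffered work and start a fresh transaction."""
        self.conn.execute("COMMIT")
        self.conn.execute("BEGIN")
        self.pending = 0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (path TEXT)")
db = BatchedDb(conn, batch=2)
for p in ["A", "A/B", "A/B/mu"]:
    db.write("INSERT INTO node (path) VALUES (?)", (p,))
db.flush()  # caller's hint: "this directory is done"
count = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]
```

A checkout is then a series of modest transactions rather than one mother all-or-nothing transaction, while most commits still happen at the wrapper's discretion.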
I think the API will be easier to use if a function can be called and
you *know* that it is possible to exit without corrupting the WC. That
it will "just work".
>> Optimizations around our use of SQLite should preferably be internal.
> Sure. But should the notion of a transaction live outside the wc_db APIs?
> Whether these are implemented as SQLite transactions or by loggy, we may need
> some way of ensuring macro consistency (though I'll admit to not having such a
> scenario on hand).
The post-commit operation of updating BASE for a file needs to be a
transaction. But I believe the API will be something like "make ACTUAL
<file> the BASE with revision <rev>". Or we might have something like
"<changelist> has been committed. install its committables into BASE
as <rev>". Dunno where the granularity is/should be, and how that
is/should interact with a new loggy system.
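A sketch of the "make ACTUAL <file> the BASE with revision <rev>" operation as a single transaction, over a toy schema of my own invention:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE node (
                    path TEXT PRIMARY KEY,
                    base_rev INTEGER,
                    base_text TEXT,
                    actual_text TEXT)""")
# Pre-commit state: BASE is at r1, ACTUAL carries the committed change.
conn.execute("INSERT INTO node VALUES ('iota', 1, 'old body', 'new body')")

def install_as_base(conn, path, new_rev):
    """Post-commit: install the ACTUAL state of `path` as BASE@new_rev.

    Done inside one transaction, so an interruption leaves either the
    old BASE or the new one -- never a half-updated row.
    """
    with conn:
        conn.execute("""UPDATE node
                        SET base_text = actual_text,
                            base_rev  = ?
                        WHERE path = ?""", (new_rev, path))

install_as_base(conn, "iota", 2)
row = conn.execute(
    "SELECT base_rev, base_text FROM node WHERE path = 'iota'").fetchone()
```

The changelist-granularity variant would just loop over the committables inside the same `with conn:` block.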
> Having written that, I just thought of the following question. Do we try to
> keep things consistent at the client-api level ('svn ps -R' is an all-or-nothing
> operation) or at the working-copy level (each propset either succeeds or fails)?
Today, a recursive propset can fail partway through, leaving some
properties changed, and others untouched. We now have the ability to
transact that entire operation, but I don't think we're *required* to.
It is an interesting question!
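For illustration, here is what transacting the whole recursive propset would look like; note this is a sketch of the *new* capability, not how svn behaves today:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE props (path TEXT, name TEXT, value TEXT, "
             "PRIMARY KEY (path, name))")

def propset_recursive(conn, paths, name, value):
    """All-or-nothing recursive propset: if any node fails, none change."""
    try:
        with conn:  # one transaction over the entire recursion
            for p in paths:
                conn.execute(
                    "INSERT INTO props (path, name, value) VALUES (?, ?, ?)",
                    (p, name, value))
    except sqlite3.Error:
        return False  # rolled back: nothing was applied
    return True

ok = propset_recursive(conn, ["A", "A/B"], "svn:eol-style", "native")
# "A" already has the property, so this whole call must fail...
bad = propset_recursive(conn, ["A/C", "A"], "svn:eol-style", "native")
# ...and "A/C" must NOT have been left half-set.
count = conn.execute("SELECT COUNT(*) FROM props").fetchone()[0]
```

The failed call rolls back the A/C row along with the failing A row, which is exactly the 'svn ps -R is all-or-nothing' semantics; dropping the outer transaction gives the per-propset semantics instead.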
p.s. after I finish the tree conflict skel code, I think that I need
to spend some good time with the design doc and wc_db to brain dump
all of our latest discussions and thoughts. we've built up quite a bit
over email and IRC...
Received on 2008-12-28 05:36:32 CET