On Thu, Jan 13, 2011 at 09:03:10AM -0500, C. Michael Pilato wrote:
> On 01/13/2011 08:25 AM, Stefan Sperling wrote:
> > On Wed, Jan 12, 2011 at 09:46:47AM -0500, C. Michael Pilato wrote:
> >> Nothing to add to the immediate discussion at this time. Just wanted to
> >> note that we have essentially the same problems to deal with in the BDB
> >> backend. The solution we've taken there is "don't drive public API
> >> callbacks from inside BDB transactions". It's not always the best for
> >> performance, but the FS API is such a high profile API for third-party
> >> consumers (very unlike the WC API in that respect, I'd imagine) that
> >> simplicity of the API "rules" was valued.
> >>
> >
> > This is about callbacks in the libsvn_client(!) and libsvn_wc APIs.
> > Those are widely used by third parties so we should really make
> > the right decision.
> >
> > Do we trust third parties to heed restrictions we place on callbacks?
> > There is no way can enforce the restrictions.
>
> Really? Our BDB code detects re-entry from within a transaction and errors
> out. :-) It does this by keeping a simple boolean state variable that is
> set whenever the code enters a BDB transaction, and cleared when the
> transaction is committed or aborted. If that variable is set, and some
> other code comes along to create a new (nested) transaction, and error is
> raised.
>
> Could we do something similar in our SQL logic?
This is a nice idea. I'll think about it.
> > Or we could decide not to run callbacks while sqlite transactions
> > are in progress, and pay some performance overhead. We'd have to
> > run multiple queries and provide data to the callback in chunks
> > (of, say, directory-sized units). I'm not sure if this would provide
> > adequate performance, but I haven't taken any measurements either (yet).
> > A recent change of this kind I made to the node walker didn't make
> > to make any notable difference.
>
> So, yeah, we really need more information to answer questions of
> performance, it seems. In the meantime, we need to answer questions of
> correctness, maintainability, and ease-of-use.
There is no question that calling callbacks while sqlite transactions
are running provides best performance. Instead of crawling a tree on
disk and running a query for each node (or a set of nodes), we can
simply "walk" the db and pass information to the callback.
This only works for read-only operations like proplist and status,
of course, but those are the ones that need to perform really well
so they can efficiently be used by GUIs, for instance.
> > We may need both approaches anyway, if we decide to use the second
> > approach in backwards-compat code.
>
> I don't see how we'd ever need both approaches. Can you explain?
If we decide that the new approach violates our backwards compat rules,
we'd have to provide a separate implementation to continue supporting
semantics of old APIs.
Stefan
Received on 2011-01-13 16:11:26 CET