lock scalability

From: Ben Collins-Sussman <sussman_at_collab.net>
Date: 2005-03-14 20:17:09 CET

A bit of a general meta-problem with the new locking feature, that I
want people to be aware of:

In general, our locking design has been done with the "occasional lock"
scenario in mind. That is: it's been designed for the way our own
project would use the feature -- somebody might lock a binary file
every once in a while.

But there's a whole class of users who will want to lock hundreds (or
thousands) of files at once, especially those migrating from other VC
systems. In cmpilato's words: "the moment we decided not to implement
directory locking is the moment we started having to deal with locking
thousands of files at once."

But we're just flat-out not efficient at handling this scenario right now:

  * just like property lists, both sides of the network currently
    hold unbounded lists of lock-objects in memory.

  * although fitz just upgraded svn_client_(un)lock() and
    svn_ra_(un)lock() to take lists of paths as arguments, the
    network layers aren't yet sending the entire list over the
    network. We've tentatively decided to put that off till 1.3.
    (In the case of ra_dav, we'd need to tweak mod_dav itself and
    get a new Apache released to fix this.)

All in all, I'm not losing a lot of sleep over this problem. I think
we'll gradually be able to evolve our implementations and APIs to make
the "thousands of locks" scenario more efficient over time.

But in this light, I'd like to propose one more tweak. We have 4 new
hooks. 'pre-lock' and 'pre-unlock' can return FALSE to block a
lock/unlock action, as you'd expect. We also have 'post-lock' and
'post-unlock' to send notification of a lock/unlock action that has
already happened. At the moment, svn_repos_fs_(un)lock() takes one
path, executes the pre-hook, then svn_fs_lock(), then the post-hook.
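To make the one-path flow concrete, here is a rough Python model of it. This is only a sketch: repos_fs_lock_one() and the demo_* callables are made-up stand-ins for the repos layer and the hook scripts, not the real C API, and the "refuse .txt files" policy is invented purely to show a pre-lock veto.

```python
from typing import Callable, List


def repos_fs_lock_one(
    path: str,
    user: str,
    pre_lock: Callable[[str, str], bool],
    fs_lock: Callable[[str, str], None],
    post_lock: Callable[[str, str], None],
) -> bool:
    """Hypothetical model of the one-path flow: pre-hook, lock, post-hook."""
    if not pre_lock(path, user):
        return False       # pre-lock vetoed; nothing is locked or announced
    fs_lock(path, user)    # stands in for svn_fs_lock()
    post_lock(path, user)  # notification only; the lock already exists
    return True


# Illustrative stand-ins for the repository state and the hook scripts.
held_locks: List[str] = []
sent_mails: List[str] = []


def demo_pre_lock(path: str, user: str) -> bool:
    return not path.endswith(".txt")  # made-up policy: refuse text files


def demo_fs_lock(path: str, user: str) -> None:
    held_locks.append(path)


def demo_post_lock(path: str, user: str) -> None:
    sent_mails.append(f"{user} locked {path}")


# One caller locking many paths drives the whole sequence once per path.
demo_paths = [f"/trunk/art/img{i}.png" for i in range(5000)]
demo_paths.append("/trunk/README.txt")
results = [
    repos_fs_lock_one(p, "harry", demo_pre_lock, demo_fs_lock, demo_post_lock)
    for p in demo_paths
]
```

In this model the post-hook runs once for every path that gets locked, so the number of notifications grows with the number of locks.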

Given that svn_repos_fs_(un)lock() will probably -- eventually -- be
receiving a whole list of locks from the network, cmpilato and I are
wondering if it makes sense to define the post-(un)lock hooks as taking
a list of paths *right now*, rather than one path. If somebody decides
to lock 5000 things at once, it's important to get one email, not 5000
of them.
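Roughly what that could look like, as a Python sketch. Everything here is hypothetical: lock_paths_batched() and summary_mail() are names invented for illustration, and keeping the pre-hook per-path is my assumption, since only the post-hooks are under discussion.

```python
from typing import Callable, List, Sequence


def lock_paths_batched(
    paths: Sequence[str],
    user: str,
    pre_lock: Callable[[str, str], bool],
    fs_lock: Callable[[str, str], None],
    post_lock: Callable[[List[str], str], None],
) -> List[str]:
    """Lock each permitted path, then call the post-hook once with the list."""
    newly_locked: List[str] = []
    for path in paths:
        if pre_lock(path, user):   # assumption: pre-hook stays per-path
            fs_lock(path, user)
            newly_locked.append(path)
    if newly_locked:               # no notification if nothing was locked
        post_lock(newly_locked, user)
    return newly_locked


def summary_mail(paths: Sequence[str], user: str) -> str:
    """Build one message body covering every path in the batch."""
    lines = [f"{user} locked {len(paths)} path(s):"]
    lines.extend(f"  {p}" for p in paths)
    return "\n".join(lines)


outbox: List[str] = []
batch = [f"/trunk/art/img{i}.png" for i in range(5000)]
done = lock_paths_batched(
    batch,
    "harry",
    pre_lock=lambda path, user: True,
    fs_lock=lambda path, user: None,
    post_lock=lambda locked, user: outbox.append(summary_mail(locked, user)),
)
```

With a list-taking post-hook, the 5000-path batch above produces a single message instead of one per path.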

We've had experience adding new APIs in the past, but have never been
faced with having to rev the API of a hook script. So I'm wondering if
post-(un)lock should be tweaked now, in preparation for the future?

Received on Mon Mar 14 20:18:26 2005

This is an archived mail posted to the Subversion Dev mailing list.
