Re: Locking design (was Re: svn commit: r9885 - trunk/notes)

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2004-05-27 05:56:12 CEST

(Apologies to everyone for the length. Many points of disagreement
here.)

On Wed, 2004-05-26 at 19:11, Branko Čibej wrote:
> >Write to a tmpfile, then move it.

> And add yet another level of file locking for multiprocess
> serialization. Why reinvent the wheel if we've already invented it
> _twice_ -- once using BDB, and once using FSFS?

Inter-process locking isn't such a tremendous wheel that we can only
locate functionality in the FS layer if it needs locking. We also have
locking in the wc layer, for instance.

In this case, if we use a flat file listing of locks, interprocess
serialization flows naturally from atomic replacement from a temporary
file and exclusive opening of that temporary file. If we use a
directory tree of locks, interprocess serialization flows naturally from
exclusive opening of the lock files.
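
A minimal sketch of the flat-file case, assuming a POSIX-like filesystem where rename is atomic. The function name `replace_locks_file`, the `.tmp` suffix, and the one-entry-per-line format are hypothetical and not taken from any Subversion code; the sketch only illustrates how exclusive creation of the temporary file serializes writers while readers need no lock at all.

```python
import os

def replace_locks_file(locks_path, new_lines):
    """Rewrite the locks file atomically.

    Raises FileExistsError if another writer already holds the temp file.
    """
    tmp_path = locks_path + ".tmp"  # hypothetical naming convention
    # O_EXCL makes creation fail if the temp file exists, so at most one
    # writer gets past this point at a time.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        with os.fdopen(fd, "w") as f:
            f.write("".join(line + "\n" for line in new_lines))
            f.flush()
            os.fsync(f.fileno())
        # Readers see either the old file or the new one, never a mix.
        os.replace(tmp_path, locks_path)
    except BaseException:
        # Release the "writer lock" if anything went wrong before the rename.
        os.unlink(tmp_path)
        raise
```

A writer that finds the temporary file already present can simply retry later. A writer that crashes between creating and renaming the temporary file leaves it behind, so a real implementation would also need some way to clear a stale one.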

> >Presumably svn_repos_fs_commit_txn(), will read/parse the locks-file
> >every time it is invoked. So it happens exactly once per commit.

> Oh fine. lock;parse;precheck;merge txn;update
> locks;write;fsync;unlock;ouch, someone did hot backup in between...

Where did all this complexity come from? There's no need to lock the
locks table before reading it. There's no need (I believe) to update
the locks table after committing the transaction.

> I've explained this before. Both ACLs and locks need a path-based lookup
> table that supports tree-based inheritance. The idea is to use _one_
> table for both, and a single lookup overhead for both, not reinvent the
> wheel yet a third time.

(1) Well, if we don't do directory locks, we certainly don't need
tree-based inheritance for locking.

(2) If we do directory locks, we still don't really need tree-based
inheritance for locking. We just have to look up the changed
directories as well as the changed files in the lock table.
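
As a rough illustration of point (2), here is a sketch that checks a commit against a lock table using exact-path lookups only, with no walk up the tree. The dict-based `lock_table` and the function `conflicting_locks` are hypothetical stand-ins for illustration, not part of any Subversion API, and the sketch assumes the commit's changed-path list already includes the directories it modifies.

```python
def conflicting_locks(lock_table, changed_paths, committer):
    """Return the changed paths that are locked by someone else.

    lock_table maps a path (file or directory) to the lock owner.
    Each changed path is looked up exactly; no parent lookup is done.
    """
    return sorted(
        path
        for path in changed_paths
        if path in lock_table and lock_table[path] != committer
    )


locks = {"/trunk/doc": "alice", "/trunk/img/logo.psd": "bob"}
changed = ["/trunk/doc", "/trunk/doc/a.txt", "/trunk/img/logo.psd"]
print(conflicting_locks(locks, changed, "bob"))  # ['/trunk/doc']
```

The cost is one lookup per changed path, which is why the changed directories have to appear in the list alongside the changed files.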

(3) We need to maintain history on ACLs, but not on locks. In some ACL
designs, an ACL change might constitute a commit, whereas a lock or
unlock pretty much necessarily does not constitute a commit.

(4) ACLs are potentially more numerous and need to be checked in more
places than locks, but do not change as often.

> [On a related note, I'm beginning to wonder why we have a separate
> filesystem and repository layer.

Think of the filesystem layer as being something close to a
transactional filesystem implementation, and the repository layer as
being the server end of a version control system layered on top of that.

By separating out hooks, configuration, and potentially locks, an FS
back end can focus on just being a good transactional filesystem.

> Taking stuff out of the filesystem and putting it in the repos to keep
> the filesystem "clean" is weird, because the definition of "clean" seems
> to shift with the wind (for example, I suggested implementing revprop
> indexes in the repos layer, and ghudson objected. Then I suggested
> implementing locks in the FS, and ghudson objected... :-)

How does my opposition to adding two features to the FS layer constitute
"shifting with the wind?"

> Initially, the separation was supposed to be: FS features into the FS,
> repository features in repos. User identification is a repository
> feature. Hooks are a borderline case. Locks, ACLs and such are
> definitely FS features in my book, because they interact with versioning
> semantics. Anything that requires a path and/or version-based lookup is
> a prime candidate for the FS layer.]

That's a valid viewpoint. Certainly, permissions and locking are both
parts of native filesystems.

I suggested putting locks in the repository layer specifically because
they *don't* seem to interact with versioning semantics very much in our
current conception of them. They aren't versioned in and of themselves
(you can lock a file and unlock a file without making a commit, and
there's no record that you did anything except perhaps in server logs),
they only apply to the head, and if a commit is made which changes a
locked file, the lock remains.
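
The semantics described above can be modelled in a few lines. This is only a toy model under the stated assumptions (locks are unversioned, apply to the head, and survive a commit); the class `RepoModel` and its methods are hypothetical and do not correspond to any real Subversion interface.

```python
class RepoModel:
    """Toy model: locks live beside the revision history, not inside it."""

    def __init__(self):
        self.head = 0      # youngest revision number
        self.locks = {}    # path -> owner, unversioned

    def lock(self, path, user):
        if self.locks.get(path, user) != user:
            raise PermissionError(path + " is locked by " + self.locks[path])
        self.locks[path] = user          # no new revision is created

    def unlock(self, path, user):
        if self.locks.get(path) == user:
            del self.locks[path]         # no new revision is created

    def commit(self, paths, user):
        for path in paths:
            if self.locks.get(path, user) != user:
                raise PermissionError(path + " is locked by " + self.locks[path])
        self.head += 1                   # only a commit advances the head
        return self.head                 # locks are left untouched
```

In this model, locking and unlocking never change `head`, and a commit by the lock holder leaves the lock in place until it is explicitly released.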

(I've asked what happens if a file is deleted when a lock is held on it,
and got the answer from several people, "well, you can't delete a file
if you don't hold the lock." Which doesn't really answer my question;
what happens if you delete the file while you do hold the lock?)

> The "where" is a note for the users, just like the "why", and does not
> imply that the repository needs to keep track of working copies.

I'm aware that's how you meant it. But even if it's only communicated
to users, I don't want the repository keeping track of working copies in
any way. Where a working copy lives is a client-specific implementation
detail.

Keeping track of where the working copy that locked a file lives opens
the door to wanting the server to be able to unreliably answer questions
like "where are the checkouts for this part of the repository" and
"where did the commits come from to this part of the repository." Sure,
someone could conceivably want that information, but it's not what we
do.

> >>If the user wants to manually unset the svn:lock-required
> >>property in order to evade a lock, that's akin to rebuilding the svn
> >>client to ignore it. We can't stop them, so there's no point in making
> >>it harder at the expense of increased user-visible complexity.

> It's this anarchistic tendency again... Let me put it this way. We, and
> many others, are using subversion in an environment where developers can
> be trusted to do the right thing. In such an environment, the
> free-for-all-by-default principle is valid. Now, in many corporate
> environments, you /can't/ trust the developers to DTRT (or the managers
> don't /want/ to), and therefore this has to be a repo admin's decision.
> Will you entertain the idea that we at least _allow_ changing this
> property to be restricted? Or do we want to lock svn out of this
> "market" by default?

Please be more specific. What are you trying to accomplish with this
restriction? If I unset the svn:lock-required property in my working
copy, it's not like that lets me commit a change to the file when
someone else is holding the lock. (If you don't want to let me change
the svn:lock-required property on the server, that sounds like an
argument for property-specific ACLs, not an argument for a complicated
server policy applying specifically to the svn:lock-required property.)

> There's also the issue of default behaviour. It's reasonable to require
> locking for binary files but not for text files by default, and to let
> people change this default -- again in a server config file. Elsewhere I
> suggested a mime-type-based decision, for example.

Once again, rather than implement complicated rules which apply
specifically to locking, we should be looking into general mechanisms
for automatically setting up properties. (Like auto-props, but more
useful.)

Switching over to <40B529A8.8000509@xbc.nu>:
>> We're going to run into corner cases like: I lock a file in a working
>> directory, make a copy of it, and then unlock it in one of the
>> working copies; the other one erroneously believes it holds a lock.

> That's no worse than breaking the lock any other way.

It is worse. The user has carried out only supported actions (lock a
file, copy a working copy, unlock a file) and is receiving non-ideal
behavior (the second working copy does not alert the user that editing
the file is a potential waste of time). It's not the end of the world;
it's just an illustration that without server control over working
copies, you can never do a particularly good implementation of exclusive
locks.

> I'm not "inventing" new uses for locks. I'm simply telling you how I
> know they're being used.

Well, from my point of view, it's requirements creep. We embarked on
this mission because users were telling us they have MS Word files or
graphic design files which can't be merged, and they want help
protecting themselves against wasting time on unmergeable edits. Nobody
mentioned anything about locking branches.

Directory locks seem like they introduce more corner cases, are more
complicated from the user point of view, and will be harder to
implement, while providing functionality for use cases which are better
served by ACLs.

> I've spent the last three months modifying ACLs instead of using locks
> to block access to the release branch where locks would serve better,

Why would locks have served better?

>>>+ [Looking at this again, I find there's one big problem with using
>>>+ the repos layer for this. It means the lock table has to be
>>>+ exposed through the FS API.

>> I have no idea what you mean here.

> I mean the lookup table should be part of the FS, of course.

I'm sorry, I can't connect the dots. It looked like you were trying to
make a specific argument that the lookup table should be part of the
FS. But I can't make any sense of your specific argument, and when I
ask you what it means, you simply repeat your conclusion.

>> So, Ben was theorizing that libsvn_repos could, without the help of
>> libsvn_fs_base, open a BDB table and use it. That's of course
>> totally unacceptable; we have two fully functioning FS back ends now,
>> and one of them isn't hamstrung by Berkeley DB's limitations.

> Exactly. The lookup table should be part of the FS implementation. :-)

That's only one of the acceptable alternatives.

(Incidentally, repeating yourself over and over again on this point,
without presenting any new arguments, as you did in
<40B52A9E.1040002@xbc.nu> in response to Ben, isn't really helping your
argument.)

[In regards to lock rationales:]
>> People may find this cumbersome; I don't think other version control
>> systems generally have it.

> They do.

Conceded. I was thinking of RCS, CVS, and (though I'm not familiar with
it and thus could be wrong about it) SourceSafe.

>> Why a global privilege? Why not a local one?

> Oh, that's related to the nature of ACL processing.

I think global ACL restrictions are as much of a trap as global
variables are in programs--they seem attractive until you suddenly have
a need for them to have different values in different contexts. But
since this thread is about locking and since we have no published ACL
design plan, that discussion doesn't really belong in this thread.

Received on Thu May 27 05:56:26 2004

This is an archived mail posted to the Subversion Dev mailing list.