Over the past few years, I've found myself moaning about Subversion's
security model. Issue #3242 has finally driven me to post about it.
The Subversion security model is pretty simple. You have paths, you have
the ability to say whether those paths are readable, writable, both, or
neither by anyone or any-specific-one.
The Subversion change transmission model is also pretty simple. You have
the editor (svn_delta_editor_t) concept, which provides an interface for
depth-first tree traversal, query, and modification.
Finally, we have atomic commits.
By themselves, these things are all Just Fine. When put together, we start
to see problems.
The binding of a single editor drive with the atomic commit concept for
commits means that for a single commit, you must use a single editor drive
to touch not just all the things you want to change in the commit, but all
the parents of those things up to and including their longest common
ancestor path. That becomes a problem when our security model's idea of
"read access" is applied.
Consider this operation:
svn move ^/A/B/E/alpha ^/A/D/G/alpha
The editor drive required to do this as an atomic operation is something
like the following:
# do file stuff
This works fine ... unless you find yourself lacking "read" permission on
"/A". Even if you are permitted to read /A/B/* and /A/D/*, Subversion
croaks because it is trying to open "/A". The funny thing is, if you break
the move into two commits (a copy and a delete), the editor drives are like
# do file stuff
"/A" isn't even consulted, so Subversion is happy. You've done effectively
the same thing -- copied a file from one place to another, and then deleted
the original. You just used two revisions to do it. It's the atomicity
requirement, as enforced by the use of the editor tree drive, that breaks
stuff for you.
Folks, this is wrong.
Now, we can point the finger at any number of places here. Some possibilies
* Shame on the editor for requiring a depth-first tree drive that
includes walking uninteresting parent directories.
* Shame on the RA commit mechanism for binding an atomic commit
operation to but a single editor drive.
But here's where I think the real finger-pointing is best aimed:
* Shame on the Subversion security model for not distinguishing
between "you may know that PATH exists" and "you may see PATH's
history, content and metadata".
CollabNet Enterprise Edition's Subversion integration uses an Apache
authorization module that is similar to our stock mod_authz_svn, except that
it uses regular expressions for its rules instead of path prefixes. As
such, it suffers in all the same ways that mod_authz_svn does when problems
like those in issue #3242 crop up.
But our ViewVC integration takes a different approach. It actually
distinguishes between "you can know that PATH exists" and "you can see
PATH's history, content, metadata, etc." The key here is that "you can know
that PATH exists" is answered affirmatively if you have read access to PATH
*or* if you have read access to any of its children. So when you browse our
ViewVC tool, you can see all the stuff you have read access to *plus* the
skeletal parent directory structure required to navigate to those items.
ViewVC will still hide revision metadata and such from those parent
directories, of course. This works great, keeps folks from having to
hand-type ViewVC URLs (they can always navigate from the root of the
repository), and I would argue leaks no information. I mean, if someone
tells you that you have access to "/A/B/D", is it considered a leak of
information to say that both "/A" and "/A/B" exist?
I believe strongly that Subversion could benefit from a switch to this
slightly more relaxed authorization model.
In terms of changes to mod_authz_svn, I would expect a bare minimum: since
it works with path-math anyway, it's a simple matter of string prefix
comparison to determine if you can read something only by virtue of being
able to read one of its [grand]children. In fact, at CollabNet we've even
experimented with exactly such a tweak, mostly with success. The results
were awesome -- you could checkout the root of a repository and only get the
items you had read access to, *even if you didn't have read access to root*.
The benefits to users are obvious -- no more telling different folks to
checkout different portions of the tree because they all have different read
But there were two showstoppers with our naive implementation:
1. First, Subversion has no way to look at the response to a "can I
read this?" query of the auth subsystem and know if the answer means
"Yes, you can read it and know everything about it", or just "You
can know it exists because (duh) you can read its children."
This means that we couldn't blot out metadata from directories
deemed readable solely because of the accessibility of their children.
2. Our current security concession -- the absent-entries leak --
gets arguably out of hand. Today, Subversion leaks the name of
any unreadable immediate children of a readable directory. But
if "readable" means "you can read at least some deep child of it",
then you find yourself leaking siblings all along the chain of
otherwise unreadable parent directories.
I'm sorry to say that I don't have ready solutions to these issues. But I
wanted to begin a dialogue on this topic. Do others see the problems I see?
Do others see the same root causes that I see? Do others have recommended
solutions that I'm not yet seeing?
 Ideally, that is -- issue #3242 exists primary because Subversion
isn't even doing the ideal thing here.
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
Received on 2009-03-10 19:36:39 CET