
Re: SoC: Path-based authz for Svnserve

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2005-06-29 20:22:26 CEST

On Wed, 2005-06-29 at 16:32 +0200, David Anderson wrote:
> - Is having a function in libsvn_repos that doesn't actually use a
> svn_repos_t acceptable, or should the authz routines move to another lib?

That's fine. Perhaps if mod_authz_svn didn't exist and we were doing
this from scratch, we'd have the libsvn_repos function take an
svn_repos_t and have a single way of finding the access control file,
but the die has been cast; HTTP servers find the access control file
using HTTP configuration, and we can't reasonably change that.

> - How do we cache the authz file (svn_config_t) in svnserve? Keep it
> loaded and reload on SIGUSR1 or similar?

I'd read it once per client connection. The structure of svnserve makes
that the easiest solution, I think. Remember that the access control
file is going to be different for different repositories, and that we
already read svnserve.conf once per client connection.
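
A minimal sketch of the once-per-connection approach, assuming hypothetical types and names (authz_file_t, connection_t, connection_begin); none of these are existing svnserve code, and the real loader would parse the file into an svn_config_t:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the parsed access control file
   (an svn_config_t in the real code). */
typedef struct authz_file_t
{
  const char *source_path;  /* where the rules were read from */
} authz_file_t;

/* Hypothetical per-connection state. */
typedef struct connection_t
{
  authz_file_t authz;
  int authz_loaded;
} connection_t;

/* Counts loader calls, so "once per connection" can be checked. */
static int authz_load_count = 0;

/* Called once, when a client connects and its repository is known. */
static void
connection_begin(connection_t *conn, const char *authz_path)
{
  assert(conn != NULL && authz_path != NULL);
  conn->authz.source_path = authz_path;  /* real code would parse here */
  conn->authz_loaded = 1;
  authz_load_count++;
}

/* Every later request on the connection reuses the loaded rules. */
static const authz_file_t *
connection_authz(const connection_t *conn)
{
  assert(conn != NULL && conn->authz_loaded);
  return &conn->authz;
}
```

Two connections to two repositories each load their own file, and no reload signal is needed because a new connection always starts from a fresh read.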

> mod_dav_svn
> already locates its own and caches the info, so no problem there.

I think you mean mod_authz_svn here.

> /**
> * Check whether @a user can access @a path in the repository @a
> * repos_name with the @a required_credentials. The access control
> * lists are supplied in @a cfg, and the updated access mask is
> * returned in @a *granted_access.
> *
> */
> svn_error_t *
> svn_repos_authz_check_access(svn_config_t *cfg, const char *repos_name,
> const char *path, const char *user,
> int required_access, int *granted_access,
> apr_pool_t *pool);
> ####

No documentation of required_access and granted_access, and I'd think
those should be enums, not bare ints. (I believe it's our style to use
enums rather than ints and #defines.)
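
A sketch of what an enum-based interface could look like. The type and function names here (authz_access_t, authz_grant) are hypothetical, chosen only for illustration, and are not an existing libsvn_repos API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical access levels.  The values are distinct bits so that
   they can be combined into a mask. */
typedef enum authz_access_t
{
  authz_none  = 0,
  authz_read  = 1 << 0,
  authz_write = 1 << 1
} authz_access_t;

/* Store in *GRANTED the subset of REQUIRED that RULE allows.
   Return nonzero if all of REQUIRED was allowed. */
static int
authz_grant(authz_access_t rule, authz_access_t required,
            authz_access_t *granted)
{
  assert(granted != NULL);
  *granted = (authz_access_t) (rule & required);
  return *granted == required;
}
```

With named values, the documentation can say what each parameter may contain, and a caller cannot pass an arbitrary integer without an explicit cast standing out in review.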

> ===Alter mod_dav_svn to use the new routines===

I think you mean mod_authz_svn here. mod_dav_svn does not truck in
authorization files; it delegates authorization back to httpd. I don't
see any compelling reason to change that. I think you arrived at the
same conclusion in the subsequent paragraph, but your section title is
still weird.

> Svnserve on the other hand is
> longer-lived, and could probably load and cache the content of the authz
> file on a semi-permanent basis, to avoid the overhead of opening and
> parsing the file. //Should// it do so? Introducing configuration caching
> for authz files implies a reloading mechanism, via SIGUSR1 or a similar
> notification mechanism.

svnserve handles much larger-grain operations than mod_dav_svn, so
caching the authz file for the lifetime of a connection should yield
acceptable performance.

> One suggestion was to be able to distinguish "Any **authenticated**
> user" from "Any user at all, anonymous or authenticated", perhaps using
> * and ** to distinguish them. However, this seemed to pose some
> technical problems, as outlined by Greg Hudson in (
> http://svn.haxx.se/dev/archive-2005-05/0043.shtml ). I don't yet have
> sufficient grasp of the mechanism of updates to understand his argument,
> if someone could expand on this...

Update operations (as well as diff and switch and status operations)
involve a pipelined set of editing changes transmitted from the server
to the client. There is currently no room in the protocol for the
server to stop and challenge the client for authentication information.
This is true of both network RA layers (ra_dav and ra_svn), and it is
deliberate: the design limits the number of round trips involved in an
update.

So, let's say I try to check out "trunk" and it turns out that, while
"trunk" is readable to anyone (authenticated or not), "trunk/foo/bar" is
read-protected. The way mod_dav_svn currently works, the server will
not challenge you for authentication when you get to trunk/foo/bar; it
will simply report an "absent" trunk/foo/bar directory and move on. As
I understand it, the normal workaround in DAV land is that if you want
to use read access controls, you make everyone authenticate, and you
create a guest user if you want anonymous read access to part of the
repository.
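
To illustrate the behavior described above, here is a simplified sketch of an update drive that never stops to challenge the client. Everything in it is an assumption made for illustration (edit_kind_t, demo_can_read, drive_update, and the hard-coded protected path); it is not how mod_dav_svn is actually structured:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum edit_kind_t { edit_add, edit_absent } edit_kind_t;

/* Return nonzero if PATH is DIR itself or lies below it. */
static int
path_is_within(const char *path, const char *dir)
{
  size_t len = strlen(dir);
  return strncmp(path, dir, len) == 0
         && (path[len] == '\0' || path[len] == '/');
}

/* Hypothetical policy: trunk/foo/bar needs an authenticated user
   (USER != NULL); everything else is readable by anyone. */
static int
demo_can_read(const char *path, const char *user)
{
  if (path_is_within(path, "trunk/foo/bar"))
    return user != NULL;
  return 1;
}

/* Drive the edit in one pass.  An unreadable path is reported as
   absent; the server has no point at which it can ask the client
   to authenticate. */
static void
drive_update(const char *const *paths, size_t count,
             const char *user, edit_kind_t *edits)
{
  size_t i;
  assert(paths != NULL && edits != NULL);
  for (i = 0; i < count; i++)
    edits[i] = demo_can_read(paths[i], user) ? edit_add : edit_absent;
}
```

An anonymous checkout of trunk gets an "absent" entry for trunk/foo/bar; the same drive for an authenticated user sends the directory normally.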

One option would be to look at the authz file at the beginning of the
checkout and say "Aha! I see a path entry for trunk/foo/bar, which is
read-protected. I bet I will run into that at some point during the
checkout. I'd better challenge the client for authentication now."
However, this can't really work if there's wildcard support in the authz
file.
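
A sketch of that up-front scan, assuming a hypothetical flattened rule list (authz_rule_t) in which every rule names a literal path. The function name needs_upfront_challenge is likewise made up for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened rule: a literal path plus whether an
   anonymous user may read it. */
typedef struct authz_rule_t
{
  const char *path;
  int anon_can_read;
} authz_rule_t;

/* Return nonzero if PATH is DIR itself or lies below it. */
static int
path_is_within(const char *path, const char *dir)
{
  size_t len = strlen(dir);
  return strncmp(path, dir, len) == 0
         && (path[len] == '\0' || path[len] == '/');
}

/* Return nonzero if some rule at or below ROOT denies anonymous
   read, meaning the server should challenge before the drive starts.
   This only works because each rule path is literal: a pattern rule
   has no single path to compare against ROOT, so the scan could not
   tell whether it will match something inside the checkout. */
static int
needs_upfront_challenge(const authz_rule_t *rules, size_t count,
                        const char *root)
{
  size_t i;
  assert(rules != NULL && root != NULL);
  for (i = 0; i < count; i++)
    if (!rules[i].anon_can_read && path_is_within(rules[i].path, root))
      return 1;
  return 0;
}
```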

> Currently and as far as I know, authz has a somewhat bad reputation of
> slowing down access to the repository, because of all the path checks
> that need to be performed.

I think most of the slowness comes from HTTP overhead, and not the
actual lookup. I'd wait until someone measures a performance problem
before worrying about this.

> The point being that if
> svnserve is made aware of that flag, it can build itself a cache lookup
> dictionary, with paths as keys and user/access rights as values. Any
> path that falls within one of the cached paths and whose requested
> rights are less than or equal to the cached ones is granted access without
> diving into the authz parsing code.

The authz code doesn't have to *parse* anything to answer the question;
the authz file was loaded and parsed into a nested set of hashes at
connection initiation time. I think if you're not careful, you'll wind
up designing a cache which is precisely as slow as the libsvn_repos
function you'd be calling on a cache miss.
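
To make that concrete, here is a rough sketch of the kind of lookup involved, under stated assumptions: a linear table (rule_t, find_rule) stands in for the nested hashes, the closest ancestor with a rule for the user wins, and the default is to deny. These names and that policy are illustrative, not the actual mod_authz_svn code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical already-parsed rule: (path, user) -> may read? */
typedef struct rule_t
{
  const char *path;   /* "" is the repository root */
  const char *user;
  int can_read;
} rule_t;

/* Stand-in for the hash probe: exact match on the first PATH_LEN
   bytes of PATH and on USER. */
static const rule_t *
find_rule(const rule_t *rules, size_t count,
          const char *path, size_t path_len, const char *user)
{
  size_t i;
  for (i = 0; i < count; i++)
    if (strlen(rules[i].path) == path_len
        && strncmp(rules[i].path, path, path_len) == 0
        && strcmp(rules[i].user, user) == 0)
      return &rules[i];
  return NULL;
}

/* Probe PATH, then each ancestor in turn, up to the root. */
static int
can_read(const rule_t *rules, size_t count,
         const char *path, const char *user)
{
  size_t len;
  assert(rules != NULL && path != NULL && user != NULL);
  len = strlen(path);
  for (;;)
    {
      const rule_t *rule = find_rule(rules, count, path, len, user);
      if (rule != NULL)
        return rule->can_read;
      if (len == 0)
        return 0;                      /* no rule anywhere: deny */
      while (len > 0 && path[len - 1] != '/')
        len--;                         /* strip the last component */
      if (len > 0)
        len--;                         /* and the slash before it */
    }
}
```

The whole check is one probe per path component. A cache keyed on path would do a comparable probe on every hit, which is why it is hard to make it faster than the function it fronts.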

Received on Wed Jun 29 20:23:55 2005

This is an archived mail posted to the Subversion Dev mailing list.