
[PATCH] speed up svn_repos_authz_check_access

From: Roderich Schupp <roderich.schupp_at_gmail.com>
Date: Wed, 7 Aug 2013 08:27:43 +0200

Hi,

this patch attempts to speed up svn_repos_authz_check_access, especially
when it is called repeatedly during the same HTTP request (or on the same
connection). Subversion issues many HTTP requests that result in only a
single call to svn_repos_authz_check_access (i.e. just for the path given
in the request itself). Others may call svn_repos_authz_check_access lots
of times, e.g. "svn log" calls it for every modified path in every
revision in the requested range of revisions.

This patch reduces the cumulative time for svn_repos_authz_check_access
(when called repeatedly on the same connection) by more than 50%
(e.g. when running the attached test program on a large body of paths).

The observations behind the patch are:

(1) Functions like authz_get_path_access in libsvn_repos/authz.c that
do the actual work all use

  svn_config_enumerate2(..., path, authz_parse_line, baton, ...)

to compute allow/deny information for the given path and user (the user
is implicit in baton). The result is computed over and over again, even
if the same path and user were already specified in a previous call.

(2) In "real life", i.e. in the Apache server, user always has the same
value within the same request (and even within the same connection).

The patch augments the svn_authz_t struct with a field "cache", which
is an apr_hash_t mapping paths to already known allow/deny information.
(This cache is obviously correct only for the same value of user, so we
store that in another field "cached_user"; if the value of user changes,
we simply throw away the existing cache.)
We only store a path in the cache if it has a [path] section in the
actual access control file, so the cache cannot grow larger than
the apr_hash_t that stores the file's sections.
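
A minimal sketch of the per-user cache described above, in plain C so
that it compiles without APR. It is not the patch itself: the names
authz_cache_t, cache_entry_t, cache_bind_user, cache_store and
cache_lookup are hypothetical, a fixed-size array stands in for the
apr_hash_t, and user is assumed to be non-NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED_PATHS 64   /* hypothetical bound, e.g. one slot per [path] section */
#define MAX_NAME 256

/* Allow/deny bits already computed for one path (hypothetical layout). */
typedef struct {
    char path[MAX_NAME];
    unsigned allow;
    unsigned deny;
} cache_entry_t;

/* Stand-in for the two fields added to svn_authz_t: "cache" and "cached_user". */
typedef struct {
    char cached_user[MAX_NAME];
    cache_entry_t entries[MAX_CACHED_PATHS];
    size_t count;
} authz_cache_t;

/* Throw away all entries if the cache was filled for a different user. */
static void cache_bind_user(authz_cache_t *cache, const char *user)
{
    assert(cache != NULL && user != NULL);
    if (strcmp(cache->cached_user, user) != 0) {
        cache->count = 0;
        strncpy(cache->cached_user, user, MAX_NAME - 1);
        cache->cached_user[MAX_NAME - 1] = '\0';
    }
}

/* Return the cached entry for path, or NULL if nothing is cached. */
static const cache_entry_t *cache_lookup(const authz_cache_t *cache,
                                         const char *path)
{
    for (size_t i = 0; i < cache->count; i++)
        if (strcmp(cache->entries[i].path, path) == 0)
            return &cache->entries[i];
    return NULL;
}

/* Remember the result for path; returns 1 on success, 0 if it does not fit. */
static int cache_store(authz_cache_t *cache, const char *path,
                       unsigned allow, unsigned deny)
{
    if (cache->count == MAX_CACHED_PATHS || strlen(path) >= MAX_NAME)
        return 0;
    cache_entry_t *e = &cache->entries[cache->count++];
    strcpy(e->path, path);
    e->allow = allow;
    e->deny = deny;
    return 1;
}
```

Calling cache_bind_user before every lookup keeps the invalidation rule
in one place: the same user leaves the entries alone, a different user
empties the cache.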

The patch then replaces all of the above calls to svn_config_enumerate2
with a wrapper function that first checks for a cached value.
On a miss it calls svn_config_enumerate2 and caches the result.
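
The lookup-or-compute pattern of such a wrapper can be sketched as
follows. This is only an illustration under stated assumptions:
cached_access, compute_fn, access_t and path_cache_t are hypothetical
names, the callback stands in for the svn_config_enumerate2 scan, and
the sketch caches any path up to a fixed bound rather than only paths
that have a [path] section.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CACHED 32          /* hypothetical fixed bound */
#define MAX_PATH_LEN 256

typedef struct { unsigned allow, deny; } access_t;

/* Stand-in for the expensive enumeration; signature is an assumption. */
typedef access_t (*compute_fn)(const char *path, void *baton);

typedef struct {
    char paths[MAX_CACHED][MAX_PATH_LEN];
    access_t values[MAX_CACHED];
    size_t count;
} path_cache_t;

/* Return the cached result for path; on a miss, compute and remember it. */
static access_t cached_access(path_cache_t *cache, const char *path,
                              compute_fn compute, void *baton)
{
    assert(cache != NULL && path != NULL && compute != NULL);

    for (size_t i = 0; i < cache->count; i++)
        if (strcmp(cache->paths[i], path) == 0)
            return cache->values[i];          /* hit: skip the scan */

    access_t result = compute(path, baton);   /* miss: do the real work */

    /* Store the result only if it fits; otherwise just return it. */
    if (cache->count < MAX_CACHED && strlen(path) < MAX_PATH_LEN) {
        strcpy(cache->paths[cache->count], path);
        cache->values[cache->count] = result;
        cache->count++;
    }
    return result;
}

/* Example callback: counts its invocations through the baton and
   grants access below /public only. */
static access_t counting_compute(const char *path, void *baton)
{
    int *calls = baton;
    (*calls)++;
    access_t a = { strncmp(path, "/public", 7) == 0 ? 1u : 0u, 0u };
    return a;
}
```

Passing a counter as the baton, as counting_compute does, makes it easy
to check that a repeated path triggers the expensive scan only once.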

It also changes the internal functions
- authz_get_path_access
- authz_get_any_access
- authz_get_tree_access
to take a svn_authz_t* as the first parameter (instead of svn_config_t*)
so that they have access to both the svn_config_t and the cache.

The patch needs to #include libsvn_subr/config_impl.h
in order to gain access to svn_config_t.pool:
the cache (the apr_hash_t itself, its keys and its values) must be
allocated from the same pool as the svn_config_t so that they have the
same lifespan.

Cheers, Roderich

Received on 2013-08-07 08:28:22 CEST

This is an archived mail posted to the Subversion Dev mailing list.