[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: mod_authz_svn and caching

From: Hans van Kranenburg <hans.van.kranenburg_at_mendix.com>
Date: Sat, 5 Nov 2016 16:39:44 +0100

Hi Stefan,

Thanks for the quick reply.

On 11/05/2016 11:44 AM, Stefan Fuhrmann wrote:
> Thanks for the issue report! There seem to be two independent
> approaches that we could take to address the problem -- see below.
> On 05.11.2016 00:19, Hans van Kranenburg wrote:
>> (please keep me in Cc: since I'm not on the list (yet))
>> [...]
>> Currently, I'm running a modified mod_authz_svn that simply disables
>> caching (not causing any noticable performance difference):
>> http://paste.debian.net/plainh/0b2f4e20
>> I understand that it's possible to create really complex authorization
>> structures inside a file for a single repository, so the caching of the
>> result of parsing the file might be useful in some cases.
> Looking at your workaround, I think the cleanest solution would be to
> introduce an AuthzSVNDisableCache option. Just grep in the same source
> file for AuthzSVNNoAuthWhenAnonymousAllowed and no_auth_when_anon_ok as
> an example for a boolean option. Use the new flag in get_access_conf()
> to optionally skip the cache access.

Ok, sounds good. I'm mainly a python programmer, and my C is currently
on the level of 'can read and understand, can modify and copy/paste
program', so that matches. :)

>> My suggestion about improving this situation would be to simply save the
>> modification time of the authz file with the cached information, and
>> check the mtime again on every request, invalidating the cache whenever
>> we notice that the file has changed.
> svnserve already has code for that. The idea is to use content SHA1
> instead of timestamps. The latter might be racy / have to low a
> resolution during the repo creation phase where an authz file could
> be created first and then filled shortly thereafter.
> SHA1, on the other hand, are immediately available for in-repository
> authz and even for external files, calculating the SHA1 is ~10x as
> fast as actually parsing the file.

Aha, so read the file anyway, and if the sha differs, continue do the
parsing step on the already read text.

I see the problem with the race condition with actual changing file
contents and mtime update.

I think changing the files in place is a bad idea in any case. For
changing the files, writing a new updated authz file first, and then
os.rename() it over the old one, should avoid these kind of problems.
But, as you mention, there can be more cases.

> The plan is to introduce a new authz implementation in, hopefully, 1.10
> and making the cache invalidation available everywhere is part of that.

Yes, if this way of working is used in multiple places, then it's a very
good reason to not do something else here.

>> I can give it a shot to prepare a patch to do this, if wanted.
> Patches are always welcome. I think adding AuthzSVNDisableCache
> should be easy enough for "giving it a shot".

I guess I have to target the development (to be 1.10) code for the
patch? I also guess that the cache invalidation change will have to lead
to deprecation/removal of the option again shortly thereafter, causing
extra cruft in the code.

So, does it make sense at all? I'm using Debian on the servers, with 1.8
now, and 1.9 in the next stable release, so I'll have to roll my own
packages anyway, for which the quick patch is good enough.

I also guess that ideally the tests need to be updated with cases which
show that authz file changes while doing requests are not seen when
caching is enabled and doing multiple requests uwing HTTP/1.1 keepalive
on a single connection, while they *are* seen when AuthzSVNDisableCache
is enabled. From what I see so far, this is not a trivial undertaking.


Hans van Kranenburg
Received on 2016-11-05 16:39:59 CET

This is an archived mail posted to the Subversion Dev mailing list.