[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Handling assertions and malfunctions in mod_dav_svn

From: Ben Reser <ben_at_reser.org>
Date: Mon, 16 Feb 2015 18:44:04 -0800

On 2/16/15 5:21 PM, Evgeny Kotkov wrote:
> 2) Install a non-abortive mod_dav_svn malfunction handler, which would log the
> malfunction details using ap_log_error() and return SVN_ERR_ASSERTION_FAIL
> when CAN_RETURN = TRUE. This option is similiar to the current behavior of
> a non-default malfunction handler, svn_error_raise_on_malfunction().
>
> (See svn_error_malfunction_handler_t for details about CAN_RETURN argument)
>
> Pros:
> Doing so would address both A) and a huge subset of B). Only a subset,
> because there still are situations with CAN_RETURN = FALSE when we really
> cannot do something graceful in a malfunction handler rather than abort().
>
> Cons:
> Our codebase currently includes ~ 300 assertions, which might eventually get
> triggered by mod_dav_svn. We might encounter situations with the callers
> ignoring an SVN_ERR_ASSERTION_FAIL. Say, they could svn_error_clear()
> it. Or turn it into dav_error *, and then mod_dav would ignore this error
> for some reasons and continue to call us as if nothing happened. This could
> result in us running in an undefined state — i.e., where abort() would have
> probably been a better choice.
>
> 3) Add a SVNAbortOnMalfunction directive (default is On). Install different
> malfunction handlers based on the value of this flag. Describe what's the
> purpose of this directive and announce an intention to change the default to
> Off in the next minor release. Let it soak, and change the default.
>
> Pros:
> Same as 2), but an administrator can make a better choice based on his/her
> Apache configuration.
>
> Cons:
> Requires time. Does not address B) with the default value.
>
>
> Thoughts?

If we're going to make this behavior optional I'd rather it just be a compile
time directive rather than a run time directive. Most users aren't going to
understand what this means and I can't imagine there would be very many
scenarios where run time configuration is desirable. So I'd discard the 3rd
option entirely.

Why not just audit the svn_error_clear() calls for unconditional calls (i.e.
those that don't only clear specific errors). The ones that we find that are
necessary can be converted such that assertion errors trigger an abort().
Maybe we can come up with a macro that tests an svn_error_t for an assertion
error and aborts if so. Something like:
[[[
void foobar(void)
{
  svn_error_t *err = svn_foo();
  SVN_ERR_ABORT_ON_ASSERT(err);
  svn_error_clear();
  bar();
}
]]]

From then on out consider unconditional svn_error_clear() calls illegal and
document them as such in our Community Guide. Also link to that documentation
from our malfunction handler documentation so that 3rd parties know about this
requirement. They might ignore it, but they already can get into trouble if
they ignore some of our usage assumptions.

I think a potentially bigger problem might be what to do about cache state.
Cache reads are obviously not a problem. Theoretically, a cache write
shouldn't happen until after a repository modification so that if we fail the
repository write then we shouldn't end up with stale cache data. The cache
design to be thread safe should be the same as what we need for this. The only
thing I'm fuzzy on is if it's possible to have an assertion in the middle of a
cache write that creates invalid state. This sort of scenario is the type of
thing that using something like memcached for caching would illuminate already
but I don't think very many people use that support and instead almost all of
our caches are using per process caches, that might have such problems.
Solving this would be a lot harder than simply auditing the clearing of errors,
it'd be a lot of careful review of cache writes and you have to ensure this
behavior gets maintained forever (the error clearing requires that too but it's
a fairly simple rule to follow).

This ultimately is something we have to solve, if we want to use a central
server model in the future in order to have a central cache, we can't be
aborting. So while I think doing this right might be slightly painful, long
term it'll pay off.
Received on 2015-02-17 03:44:35 CET

This is an archived mail posted to the Subversion Dev mailing list.