[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Handling assertions and malfunctions in mod_dav_svn

From: Evgeny Kotkov <evgeny.kotkov_at_visualsvn.com>
Date: Tue, 17 Feb 2015 20:44:29 +0300

Ben Reser <ben_at_reser.org> writes:

> If we're going to make this behavior optional I'd rather it just be a compile
> time directive rather than a run time directive. Most users aren't going to
> understand what this means and I can't imagine there would be very many
> scenarios where run time configuration is desirable. So I'd discard the 3rd
> option entirely.

I was thinking that the runtime option could be useful, because the desired
behavior might depend on the MPM being used. With non-threaded pre-forking
servers, problems with a single request wouldn't affect other requests [1].
Threaded servers, however, are those that could benefit from this kind of
option. MPMs can be built as DSO modules and dynamically loaded into
the server, hence adjusting the compile-time directive accordingly could
be painful.

> Why not just audit the svn_error_clear() calls for unconditional calls (i.e.
> those that don't only clear specific errors). The ones that we find that are
> necessary can be converted such that assertion errors trigger an abort().
> Maybe we can come up with a macro that tests an svn_error_t for an assertion
> error and aborts if so. Something like:
> [[[
> void foobar(void)
> {
> svn_error_t *err = svn_foo();
> SVN_ERR_ABORT_ON_ASSERT(err);
> svn_error_clear();
> bar();
> }
> ]]]
>
> From then on out consider unconditional svn_error_clear() calls illegal
> and document them as such in our Community Guide. Also link to that
> documentation from our malfunction handler documentation so that 3rd parties
> know about this requirement. They might ignore it, but they already can get
> into trouble if they ignore some of our usage assumptions.
>
> I think a potentially bigger problem might be what to do about cache state.
> Cache reads are obviously not a problem. Theoretically, a cache write
> shouldn't happen until after a repository modification so that if we fail the
> repository write then we shouldn't end up with stale cache data. The cache
> design to be thread safe should be the same as what we need for this. The
> only thing I'm fuzzy on is if it's possible to have an assertion in the
> middle of a cache write that creates invalid state. This sort of scenario is
> the type of thing that using something like memcached for caching would
> illuminate already but I don't think very many people use that support and
> instead almost all of our caches are using per process caches, that might
> have such problems. Solving this would be a lot harder than simply auditing
> the clearing of errors, it'd be a lot of careful review of cache writes and
> you have to ensure this behavior gets maintained forever (the error clearing
> requires that too but it's a fairly simple rule to follow).
>
> This ultimately is something we have to solve, if we want to use a central
> server model in the future in order to have a central cache, we can't be
> aborting. So while I think doing this right might be slightly painful, long
> term it'll pay off.

I like where this is heading (suppose we might as well make a couple of
interesting findings), so here is what I propose:

- We solve the "A) Logging" part of the problem right now in trunk. The patch
  is quite trivial and doesn't change the existing abortive behavior. So, we
  would be basically doing the same as in Subversion 1.8, but with appropriate
  logging.

  I am thinking about doing this:
[[[
Index: subversion/mod_dav_svn/mod_dav_svn.c
===================================================================
--- subversion/mod_dav_svn/mod_dav_svn.c (revision 1660432)
+++ subversion/mod_dav_svn/mod_dav_svn.c (working copy)
@@ -143,6 +143,25 @@ init(apr_pool_t *p, apr_pool_t *plog, apr_pool_t *
   return OK;
 }

+static svn_error_t *
+malfunction_handler(svn_boolean_t can_return,
+ const char *file, int line,
+ const char *expr)
+{
+ if (expr)
+ ap_log_error(APLOG_MARK, APLOG_CRIT, 0, NULL,
+ "mod_dav_svn: file '%s', line %d, assertion \"%s\" failed",
+ file, line, expr);
+ else
+ ap_log_error(APLOG_MARK, APLOG_CRIT, 0, NULL,
+ "mod_dav_svn: file '%s', line %d, internal malfunction",
+ file, line);
+ abort();
+
+ /* Should not be reached. */
+ return SVN_NO_ERROR;
+}
+
 static int
 init_dso(apr_pool_t *pconf, apr_pool_t *plog, apr_pool_t *ptemp)
 {
@@ -162,6 +181,8 @@ init_dso(apr_pool_t *pconf, apr_pool_t *plog, apr_
       return HTTP_INTERNAL_SERVER_ERROR;
     }

+ svn_error_set_malfunction_handler(malfunction_handler);
+
   return OK;
 }
]]]

- We attempt to solve "B) Taking down other threads" in 1.10 by carefully
  examining the calling sites, how the caching behaves, etc., and aim towards
  a guarantee that nothing will break with a non-abortive malfunction handler.
  I am actually interested in making this happen.

How does this sound?

[1] https://httpd.apache.org/docs/current/mod/prefork.html

Regards,
Evgeny Kotkov
Received on 2015-02-17 18:45:17 CET

This is an archived mail posted to the Subversion Dev mailing list.