On 17 February 2015 at 20:44, Evgeny Kotkov <evgeny.kotkov_at_visualsvn.com> wrote:
> Ben Reser <ben_at_reser.org> writes:
>
>> If we're going to make this behavior optional I'd rather it just be a compile
>> time directive rather than a run time directive. Most users aren't going to
>> understand what this means and I can't imagine there would be very many
>> scenarios where run time configuration is desirable. So I'd discard the 3rd
>> option entirely.
>
> I was thinking that the runtime option could be useful, because the desired
> behavior might depend on the MPM being used. With non-threaded pre-forking
> servers, problems with a single request wouldn't affect other requests [1].
> Threaded servers, however, are those that could benefit from this kind of
> option. MPMs can be built as DSO modules and dynamically loaded into
> the server, hence adjusting the compile-time directive accordingly could
> be painful.
>
>> Why not just audit the svn_error_clear() calls for unconditional calls (i.e.
>> those that don't only clear specific errors). The ones that we find that are
>> necessary can be converted such that assertion errors trigger an abort().
>> Maybe we can come up with a macro that tests an svn_error_t for an assertion
>> error and aborts if so. Something like:
>> [[[
>> void foobar(void)
>> {
>> svn_error_t *err = svn_foo();
>> SVN_ERR_ABORT_ON_ASSERT(err);
>> svn_error_clear();
>> bar();
>> }
>> ]]]
>>
>> From then on out consider unconditional svn_error_clear() calls illegal
>> and document them as such in our Community Guide. Also link to that
>> documentation from our malfunction handler documentation so that 3rd parties
>> know about this requirement. They might ignore it, but they already can get
>> into trouble if they ignore some of our usage assumptions.
>>
>> I think a potentially bigger problem might be what to do about cache state.
>> Cache reads are obviously not a problem. Theoretically, a cache write
>> shouldn't happen until after a repository modification so that if we fail the
>> repository write then we shouldn't end up with stale cache data. The cache
>> design to be thread safe should be the same as what we need for this. The
>> only thing I'm fuzzy on is if it's possible to have an assertion in the
>> middle of a cache write that creates invalid state. This sort of scenario is
>> the type of thing that using something like memcached for caching would
>> illuminate already but I don't think very many people use that support and
>> instead almost all of our caches are using per process caches, that might
>> have such problems. Solving this would be a lot harder than simply auditing
>> the clearing of errors, it'd be a lot of careful review of cache writes and
>> you have to ensure this behavior gets maintained forever (the error clearing
>> requires that too but it's a fairly simple rule to follow).
>>
>> This ultimately is something we have to solve, if we want to use a central
>> server model in the future in order to have a central cache, we can't be
>> aborting. So while I think doing this right might be slightly painful, long
>> term it'll pay off.
>
> I like where this is heading (suppose we might as well make a couple of
> interesting findings), so here is what I propose:
>
> - We solve the "A) Logging" part of the problem right now in trunk. The patch
> is quite trivial and doesn't change the existing abortive behavior. So, we
> would be basically doing the same as in Subversion 1.8, but with appropriate
> logging.
>
> I am thinking about doing this:
> [[[
> Index: subversion/mod_dav_svn/mod_dav_svn.c
> ===================================================================
> --- subversion/mod_dav_svn/mod_dav_svn.c (revision 1660432)
> +++ subversion/mod_dav_svn/mod_dav_svn.c (working copy)
> @@ -143,6 +143,25 @@ init(apr_pool_t *p, apr_pool_t *plog, apr_pool_t *
> return OK;
> }
>
> +static svn_error_t *
> +malfunction_handler(svn_boolean_t can_return,
> + const char *file, int line,
> + const char *expr)
> +{
> + if (expr)
> + ap_log_error(APLOG_MARK, APLOG_CRIT, 0, NULL,
> + "mod_dav_svn: file '%s', line %d, assertion \"%s\" failed",
> + file, line, expr);
> + else
> + ap_log_error(APLOG_MARK, APLOG_CRIT, 0, NULL,
> + "mod_dav_svn: file '%s', line %d, internal malfunction",
> + file, line);
> + abort();
> +
> + /* Should not be reached. */
> + return SVN_NO_ERROR;
> +}
> +
> static int
> init_dso(apr_pool_t *pconf, apr_pool_t *plog, apr_pool_t *ptemp)
> {
> @@ -162,6 +181,8 @@ init_dso(apr_pool_t *pconf, apr_pool_t *plog, apr_
> return HTTP_INTERNAL_SERVER_ERROR;
> }
>
> + svn_error_set_malfunction_handler(malfunction_handler);
> +
> return OK;
> }
> ]]]
>
> - We attempt to solve "B) Taking down other threads" in 1.10 by carefully
> examining the calling sites, how the caching behaves, etc., and aim towards
> a guarantee that nothing will break with a non-abortive malfunction handler.
> I am actually interested in making this happen.
>
> How does this sound?
>
+1, sounds like a good plan.
--
Ivan Zhakov
Received on 2015-02-17 19:07:47 CET