Re: Handling assertions and malfunctions in mod_dav_svn

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Tue, 17 Feb 2015 23:40:30 +0000

Evgeny Kotkov wrote on Tue, Feb 17, 2015 at 20:44:29 +0300:
> - We attempt to solve "B) Taking down other threads" in 1.10 by carefully
> examining the calling sites, how the caching behaves, etc., and aim towards
> a guarantee that nothing will break with a non-abortive malfunction handler.
> I am actually interested in making this happen.

What can we do if an assertion fails inside the cache implementation?

I see three options: log it and continue; continue with cache disabled;
abort. However, none of them is necessarily safe:

- The first might cause data loss if the assertion was ultimately caused
  by, say, faulty hardware. (If we have faulty hardware, aborting could
  actually be the best option, since it prevents non-mod_dav_svn threads
  from experiencing data losses.)

- The second might lead to unacceptable performance degradation.

- The third will take down non-mod_dav_svn threads too, resulting in
  denial of service for the requests those threads were handling.

How do we choose among these three risks? Is this perhaps a place where
it's justified to create a knob for the admin to set according to her
own risk analysis?
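
To make the knob idea concrete, here is a minimal sketch of how such
a setting might be parsed.  Everything in it is hypothetical: the enum,
the function name and the value strings ("log", "disable-cache",
"abort") are made up for illustration and exist neither in Subversion
nor in mod_dav_svn.  Falling back to abort for unset or unknown values
is only one possible default, not a recommendation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical policy values, one per option discussed above. */
typedef enum cache_assert_policy_t
{
  policy_log_and_continue,  /* log the failure, keep using the cache */
  policy_disable_cache,     /* keep serving requests, cache turned off */
  policy_abort              /* take the whole process down */
} cache_assert_policy_t;

/* Map a (hypothetical) configuration value to a policy.
   NULL and unrecognized values fall back to policy_abort;
   that default is an assumption made for this sketch. */
static cache_assert_policy_t
parse_cache_assert_policy(const char *value)
{
  if (value != NULL)
    {
      if (strcmp(value, "log") == 0)
        return policy_log_and_continue;
      if (strcmp(value, "disable-cache") == 0)
        return policy_disable_cache;
    }
  return policy_abort;
}
```

An admin who fears data loss more than downtime would leave the setting
at its default; one who cannot afford to lose unrelated requests would
pick one of the other two values.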

In general, we treat assertions as an all-or-nothing deal.  The
malfunction handler is process-global, and it either always raises or
always aborts.  I keep wondering if we shouldn't give malfunction
handlers more information about the failure mode that was observed —
e.g., where the assertion was invoked, why it was invoked, and how risky
the assertion site deems it to continue — so as to let the malfunction
handler make more informed abort-or-raise decisions.
For example, assertions that verify preconditions often mean there is
a bug in the caller but continued execution would be safe, whereas
assertions in the bowels of svn_cache_* are more worrisome.  But we use
the same SVN_ERR_ASSERT() for both,¹ and the malfunction handler can't
tell the difference.
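
As a rough sketch of what "more information" could look like, the code
below passes a small struct to the decision function.  All type and
function names are hypothetical and are not part of the existing
malfunction handler API; the rule it applies (raise only when the
assertion site itself reports that continuing is low-risk) is just one
policy a handler could choose.

```c
#include <stddef.h>

/* Why the assertion was invoked (hypothetical classification). */
typedef enum malfunction_kind_t
{
  kind_precondition,        /* caller passed bad arguments */
  kind_internal_invariant   /* internal state is inconsistent */
} malfunction_kind_t;

/* How risky the assertion site deems it to continue (hypothetical). */
typedef enum continue_risk_t
{
  risk_low,
  risk_high
} continue_risk_t;

/* Where, why, and how risky: the three pieces of information
   suggested in the text above. */
typedef struct malfunction_info_t
{
  const char *file;
  int line;
  const char *expr;
  malfunction_kind_t kind;
  continue_risk_t risk;
} malfunction_info_t;

typedef enum malfunction_action_t
{
  action_raise,   /* return an error to the caller */
  action_abort    /* terminate the process */
} malfunction_action_t;

/* Decide per failure instead of all-or-nothing.  Missing information
   is treated as high risk. */
static malfunction_action_t
choose_malfunction_action(const malfunction_info_t *info)
{
  if (info == NULL || info->risk == risk_high)
    return action_abort;
  return action_raise;
}
```

With something like this, a failed precondition marked low-risk would
be raised as an error, while a high-risk failure inside the cache would
still abort.
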
(Or maybe we just shouldn't be using SVN_ERR_ASSERT() for preconditions,
but that's another way of saying that, yes, we should be making
a distinction between failed preconditions and failed cache bowel

¹ Examples: svn_fs_create_access(username=NULL) and
Received on 2015-02-18 00:45:17 CET

This is an archived mail posted to the Subversion Dev mailing list.