Test suite doesn't detect httpd crashes
From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Wed, 17 Dec 2014 12:56:06 +0000
I have found that Apache 2.4.7 with the 'event' MPM sometimes crashes while running our test suite. I discussed this with Philip and Ben yesterday and they let me know that this is a known issue with several 2.4.x versions, and is fixed in later versions (>= 2.4.10 ?). The Subversion I'm currently testing with is trunk@1646184.
The problem I want to highlight here is that our test suite doesn't report these crashes. On some runs it reports individual test failures and an overall failure, but on other runs it reports no individual test failures and overall success, even though Apache crashed.
Do we want to make our test suite detect these crashes, and report overall failure even if each of the individual test scenarios completed successfully?
It seems to me we should. What do you think?
After a run of externals_tests.py --parallel that reported success on all tests, the error_log first contains this:
[mpm_event:error] ... AH00484: server reached MaxRequestWorkers setting, consider raising the MaxRequestWorkers setting
which doesn't seem to be a problem in itself: I always get this message even when Apache doesn't go on to crash, and I get a similar message with the 'worker' MPM, which doesn't go on to crash either.
Then it may contain one or more messages like this:
[core:notice] ... AH00051: child pid 23555 exit signal Segmentation fault (11)...
At the moment I don't know if any error indication is being returned to some part of the test runner and then not reported as test failure, or if searching the error_log is the only way to detect these crashes.
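If searching the error_log does turn out to be the way to go, something like the following might serve as a starting point. This is only a minimal sketch: the function name find_child_crashes and the regular expression are hypothetical, and the pattern assumes the log lines look like the single AH00051 message quoted above.

```python
import re

# Matches notices such as:
#   [core:notice] ... AH00051: child pid 23555 exit signal Segmentation fault (11)...
# Assumption: the format follows the one message quoted in this mail.
CRASH_RE = re.compile(
    r"AH00051: child pid (?P<pid>\d+) exit signal (?P<name>.+?) \((?P<signum>\d+)\)"
)

def find_child_crashes(log_text):
    """Return (pid, signal_name, signal_number) for each crash notice found."""
    crashes = []
    for line in log_text.splitlines():
        m = CRASH_RE.search(line)
        if m:
            crashes.append(
                (int(m.group("pid")), m.group("name"), int(m.group("signum")))
            )
    return crashes
```

The AH00484 MaxRequestWorkers message is deliberately not matched, since it appears on runs that don't crash as well.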
Here is how I run the tests:
$ (cd obj-dir/ && rm -rf subversion/tests/cmdline/httpd-20141217-* && APACHE_MPM=event $SVN_SRC/subversion/tests/cmdline/davautocheck.sh externals --cleanup --parallel; echo "Test suite returned: $?"; grep -i "Segmentation fault\|mpm_.*error" subversion/tests/cmdline/httpd-20141217-*/error_log)
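The same check could in principle be folded into the overall result rather than done by eye with grep. The following is a rough sketch under stated assumptions, not how the test runner currently works: overall_status is a hypothetical helper, and it treats any error_log line containing "exit signal" as evidence of a crashed child.

```python
import glob

def overall_status(suite_exit_code, log_glob):
    """Return 0 only if the suite passed and no matching error_log records a crash.

    suite_exit_code: the exit status reported by the test run.
    log_glob: a glob pattern for the error_log files to inspect (caller-supplied).
    """
    crashed = False
    for path in glob.glob(log_glob):
        with open(path, errors="replace") as f:
            # Assumption: a crashed child always logs an "exit signal" notice.
            if any("exit signal" in line for line in f):
                crashed = True
    if suite_exit_code != 0:
        return suite_exit_code  # keep the suite's own failure status
    return 1 if crashed else 0
```

With this, a run where every test passed but a child segfaulted would still yield a non-zero overall status.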
Here is the entire error_log for one run, after stripping out 'authz_' messages:
Occasionally the operating system's crash report dialogue has popped up during one of these runs, but in most runs it hasn't. I don't know why.
FWIW, when I enabled core dumping (ulimit -c 100000) and loaded a core dump into GDB I saw:
That was not the same test run that I reported above, and I haven't checked if the back-trace is similar every time.
This is an archived mail posted to the Subversion Dev mailing list.