Re: ra-test.exe deadlock condition

From: Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de>
Date: Sat, 4 Feb 2017 12:41:35 +0100

On 31.01.2017 10:09, Stefan wrote:
> Hi,
>
> I've been looking at the cause of a deadlock when running ra-test.exe
> with -fs-type=fsx (trunk version).
>
> The most important findings are summed up here atm [1].
>
> The issue was discussed with brane and danielsh on IRC (thanks for your
> time, once again).
>
> As far as my current understanding of the problem goes: the deadlock is
> caused by the fact that the apr_terminate() function registered in
> svn_cmdline_init() via the atexit-call is called after the termination
> of the threads which were created as part of the calls to
> apr_thread_pool_push() in svn_fs_x__batch_fsync_run().
>
> This means that apr's thread counter (thd_cnt) gets out of sync (since
> the apr function thread_pool_func() is not executed), and the cleanup
> then gets stuck in thread_pool_cleanup(), waiting for the already
> terminated threads to terminate.
>
> To me it looks like svnserve's main-function already contains a
> safeguard against a corresponding issue, and calls
> apr_thread_pool_destroy(threads) (or was this a completely different
> scenario?). This however does not cover the threads created from
> svn_fs_x__batch_fsync_run().
>
> Talking to danielsh and brane, it became apparent to me that the issue
> might not be too obvious (in the end it might still be an issue with how
> I build SVN, which would cause the atexit-registered apr_terminate()
> function to be called too late). It's also not fully clear to me at
> which exact point (with regard to registered atexit() calls) the threads
> of the process are terminated when the process itself terminates. If
> atexit()-registered functions indeed get called after the threads are
> forcibly terminated (which is what it looks like to me atm), it might
> contradict the C (89/99) standard - see [2], 7.20.4.2/7.20.4.3. On the
> other hand, this thread on stackoverflow [3] suggests it's simply
> undefined (by the standard) which comes first.
>
> As danielsh suggested, I'm planning to come up with a plain minimal
> repro app only based on APR demonstrating the problem, so to make it
> more obvious (and double check for myself) what the issue is about.
>
> Regards,
> Stefan
>
> [1] http://www.luke1410.de:8090/browse/MAXSVN-94
> [2] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
> [3]
> https://stackoverflow.com/questions/39655868/what-does-the-posix-standard-say-about-thread-stacks-in-atexit-handlers-what

Hi Stefan,

I had a look at the code and found a possibly related problem.
If you are using DLLs, this might have affected you.

It would be nice if you could try r1781657 and see whether it
makes any difference in your case.
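
For anyone following the thread, here is a minimal sketch of the ordering
problem described in the quoted analysis. It is not the Subversion or APR
code: it uses plain POSIX threads as a stand-in for the APR thread pool, and
all names (live_workers, start_workers, join_workers, wait_for_workers) are
hypothetical. The counter plays the role that the quoted message attributes
to thd_cnt, and wait_for_workers() mimics a cleanup routine that blocks until
the counter reaches zero. If the workers were killed before they decrement
the counter, that wait would never return; joining them explicitly first
avoids relying on exit-time ordering.

```c
#include <pthread.h>
#include <sched.h>

#define WORKER_COUNT 4

/* Hypothetical stand-in for a thread pool's live-thread counter. */
static int live_workers = 0;
static int started = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_t workers[WORKER_COUNT];

static int read_live_workers(void)
{
    pthread_mutex_lock(&lock);
    int n = live_workers;
    pthread_mutex_unlock(&lock);
    return n;
}

static void *worker_main(void *arg)
{
    (void)arg;
    /* The real work (e.g. an fsync) would happen here. */
    pthread_mutex_lock(&lock);
    --live_workers; /* skipped if the thread is killed before this point */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Start the workers; returns 0 on success, -1 if a thread could not start. */
int start_workers(void)
{
    for (started = 0; started < WORKER_COUNT; ++started) {
        pthread_mutex_lock(&lock);
        ++live_workers;
        pthread_mutex_unlock(&lock);
        if (pthread_create(&workers[started], NULL, worker_main, NULL) != 0) {
            pthread_mutex_lock(&lock);
            --live_workers;
            pthread_mutex_unlock(&lock);
            return -1;
        }
    }
    return 0;
}

/* Join every started worker; returns the counter value afterwards. */
int join_workers(void)
{
    for (int i = 0; i < started; ++i)
        pthread_join(workers[i], NULL);
    started = 0;
    return read_live_workers();
}

/* Cleanup-style wait: blocks until the counter reaches zero. */
void wait_for_workers(void)
{
    while (read_live_workers() != 0)
        sched_yield();
}
```

The intended call sequence is start_workers(), then join_workers() before
the process starts exiting, and only then anything like wait_for_workers()
from an exit handler. With the join in place, the wait returns immediately.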

-- Stefan^2.
Received on 2017-02-04 12:41:52 CET

This is an archived mail posted to the Subversion Dev mailing list.