[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Bugs in 1.8.5 libsvn_subr which make 32-bit threaded svnserve sometimes hang and eat 100% cpu

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Thu, 13 Feb 2014 20:26:18 +0100

On Thu, Feb 13, 2014 at 12:49 PM, <vitalif_at_yourcmc.ru> wrote:

> Hi!
> I want to report two bugs in Subversion 1.8.5 libsvn_subr
> subversion/libsvn_subr/cache-membuffer.c, a more important one and a less
> important one. I've fixed them, and patches are attached - please apply
> them.

Hi Vitali,

Thank you very much for taking the time to debug the hangs
and to come up with a patch.

It also tells me that the caching code can still be understood.

> First bug (important):
> There is an incorrect integer addition to cache->data_used which makes
> 32-bit threaded svnserve sometimes hang and eat 100% CPU. When it hangs,
> one thread is doing an infinite loop in ensure_data_insertable() with write
> lock acquired, and others just wait for it to release the lock. So it looks
> like 100% usage of a single CPU core and non-working svn:// access.
> The problem is in line 1377: "cache->data_used += size - entry->size;" is
> incorrect because cache->data_used is always apr_uint64_t, and size and
> entry->size are size_t which is unsigned 32-bit integer type on 32-bit
> platforms. And size is always [almost always] less than entry->size at that
> line, so cache->data_used becomes > 0x100000000 at once which makes
> ensure_data_insertable() to loop infinitely because all entries in cache
> "become" less than 1/8 * average entry size.
> The patch is in the attachment 'fix-membuffer-u32-overflow'

Fixed in r1567985 using Peter's suggestion.

> Second bug (less important):
> There is also another bug in libsvn_subr membuffer cache implementation:
> cache->hit_count and entry->hit_count are incremented without proper mutual
> exclusion (only with read lock acquired), so some increments get lost in
> multithreaded svnserve; as it's unlikely two threads access the same entry
> at once, usually cache->hit_count (not entry->hit_count) increments are
> lost, which leads to cache->hit_count being less than sum(entry->hit_count).
> So cache->hit_count may overflow during subtraction of entry->hit_count
> from it and lead to removing of extra values from the cache in
> ensure_data_insertable(), because the retain threshold may be very big;
> also it may overflow again, become less than cache->entries_used and also
> lead to removing of extra items.
> Instead of killing the performance by mutual exclusion I think it's
> possible to just check for integer overflow.

This one is more subtle. My patch for it is in r1567996
with a follow-up in 1568009. It took me a while to see
that your proposal is actually a stable solution, i.e. not
likely to make the global counter value drift over time.

I also fixed a related issue with 32 bit hit counter overflows
(rare but may happen after some time for large directories
or items close to the root of the access path).

> This doesn't affect general svnserve usability, so severity is minor.

These problems may affect svnserve in threaded more
more than httpd but the latter should still exhibit the
infinite loop behaviour.

> The patch is in the attachment 'fix-membuffer-hit_count-overflow'
> Hope you'll apply these patches soon, or fix the bugs some another way :)

Well, the good news is that 1.7 is not affected (all access
is fully synchronized there) nor 1.9 / trunk. The latter uses
a more sophisticated scheme and will exit the critical loop
after a few iterations without affecting overall performance.

Thanks again for the report!

-- Stefan^2.
Received on 2014-02-13 20:26:54 CET

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.