
Re: Kernel CPU load on Apache/2.4.10 with SVN 1.8.10 (r1615264)

From: Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de>
Date: Thu, 18 Dec 2014 18:11:47 +0100

On 17.12.2014 15:29, Charlie Smurthwaite wrote:

Hi Charlie,

Thanks for the report and in particular for providing a solution to it.
I hadn't seen this effect to this extent on any of my NUMA
workstations so far. So, take all the feedback and suggestions in
this post with great caution.

>
> On 17/12/14 09:25, Charlie Smurthwaite wrote:
>> Hi,
>>
>> I am running an SVN HTTP server using Apache/2.4.10 with SVN 1.8.10
>> (r1615264) and I am often seeing very high kernel CPU load.
>>
>> The CPU time seems to be consumed in the kernel by "_raw_spin_lock",
>> with the httpd processes spending much of their time waiting on calls
>> to "futex".
A futex ("fast userspace mutex") is simply an efficient Linux
locking primitive. Nothing spectacular in itself for a multi-threaded server.
>>
>> Here's a syscall analysis from svn:
>> http://paste.codebasehq.com/pastes/7qzt68lx2eghz2gjns
The clue here is that the locks take ~1s to acquire where it
should be microseconds. This suggests that the process /
thread currently holding the futex cannot be scheduled by the
system to finish its work. This is consistent with heavy swapping
or high kernel load.

>> Here's the kernel CPU time analysis: http://i.imgur.com/37Ryt5V.png
An interesting clue in the second one is copy_pte_range, where
PTE is short for Page Table Entry. So, the OS is busy updating
virtual-to-physical memory translation tables. Heavy use of
mmap or possibly forking may cause this - or the OS trying
to figure out which pages to evict if memory gets tight.

Another problem may be caused by frequent forking or large
processes (probably not the case here): the page table may
grow significantly. Have a look at /proc/meminfo. E.g. the entry
for PageTables should not exceed 1/100th of your RAM.
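That 1/100 rule of thumb is easy to check mechanically. A small Python sketch (the sample values below are made up for illustration; on a real server, read /proc/meminfo itself):

```python
# Hypothetical excerpt of /proc/meminfo; on a real box, use
# open("/proc/meminfo").read() instead of this sample string.
sample = """MemTotal:       32768000 kB
PageTables:       120000 kB"""

def pagetable_ratio(meminfo_text):
    """Return PageTables as a fraction of MemTotal (both reported in kB)."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, rest = line.split(":", 1)
        fields[name.strip()] = int(rest.split()[0])
    return fields["PageTables"] / fields["MemTotal"]

print(round(pagetable_ratio(sample), 4))  # -> 0.0037, well under 1/100
```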

>> Here's a snapshot from htop showing why this is a problem:
>> http://i.imgur.com/I3mDDbi.png
That one got me confused at first until I realized that each
thread is shown separately. Pressing 'H' in htop will hide the
individual threads, and you'll see the "actual" processes and
their memory consumption.

All in all the numbers are a bit on the big side. If this server
only handles SVN requests, I would expect smaller resident
sizes unless the number of requests currently served is high.
Less than 100MB / thread would be my expectation.

Also, the virtual memory sizes are much higher than phys.
in some cases. That makes me suspect memory fragmentation
issues. On some platforms, the Apache Runtime (APR) will
use anonymous mmap to allocate memory, eliminating
some of that fragmentation and other overhead.

>>
>> I'd appreciate if anyone could tell me whether I have likely
>> configured something incorrectly, whether there is an obvious
>> workaround, or whether this needs to be escalated as a bug, and if
>> so, to whom?
>>
>> Thank you!
>>
>> Charlie
>>
>>
>> My httpd is configured as follows:
>>
>> root_at_storage02:~# /opt/subversion-server/bin/httpd -V
>> Server version: Apache/2.4.10 (Unix)
>> Server built: Nov 7 2014 15:16:58
>> Server's Module Magic Number: 20120211:36
>> Server loaded: APR 1.5.1, APR-UTIL 1.5.4
>> Compiled using: APR 1.5.1, APR-UTIL 1.5.4
>> Architecture: 64-bit
>> Server MPM: worker
>> threaded: yes (fixed thread count)
>> forked: yes (variable process count)
>> Server compiled with....
>> -D APR_HAS_SENDFILE
>> -D APR_HAS_MMAP
>> -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
>> -D APR_USE_SYSVSEM_SERIALIZE
>> -D APR_USE_PTHREAD_SERIALIZE
>> -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
>> -D APR_HAS_OTHER_CHILD
>> -D AP_HAVE_RELIABLE_PIPED_LOGS
>> -D DYNAMIC_MODULE_LIMIT=256
>> -D HTTPD_ROOT="/opt/subversion-server"
>> -D SUEXEC_BIN="/opt/subversion-server/bin/suexec"
>> -D DEFAULT_PIDLOG="logs/httpd.pid"
>> -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
>> -D DEFAULT_ERRORLOG="logs/error_log"
>> -D AP_TYPES_CONFIG_FILE="conf/mime.types"
>> -D SERVER_CONFIG_FILE="conf/httpd.conf"
Nothing suspicious here but I'm not an expert. Assuming that you
did not change SVN's default cache config, there are a few things
you might want to try.

* Limit the number of worker processes (ServerLimit) to e.g. 4.
   This allows larger caches per instances and reduces fragmentation.

* Set MaxRequestsPerChild (if set at all) to some large-ish value,
   e.g. 100000. Workers then live long enough to benefit from hot caches.

* Set the per-process SVN cache to 2GB (SVNInMemoryCacheSize),
   i.e. a total of 8GB.

* Set "SVNCacheFullTexts on" and "SVNCacheTextDeltas on".
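Taken together, those suggestions might look like this in httpd.conf (a sketch only; adjust the numbers to your RAM, and note that Apache does not allow comments on the same line as a directive):

```
# Worker MPM: a few long-lived processes with large per-process caches
<IfModule mpm_worker_module>
    ServerLimit         4
    MaxRequestsPerChild 100000
</IfModule>

# Per-process SVN cache, given in kBytes: 2 GB each, 8 GB total for 4 processes
SVNInMemoryCacheSize 2097152
SVNCacheFullTexts    on
SVNCacheTextDeltas   on
```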

These cache settings have a variety of effects. First, even a single
client doing a checkout of some cold repository benefits from caching,
as there is a lot of context data (directories etc.) to keep track of.
If there are several connections to the server, this alone makes
good use of hundreds of MB. CPU load is reduced as data
parsers etc. don't have to be rerun again and again.

Another effect is that less data needs to be parsed and reconstructed,
i.e. there is less dynamic data allocation involved. This reduces
memory fragmentation and lowers the typical worker process
size (once you deduct the in-process cache size).

Finally, Subversion will segment larger caches. That means there
will be less contention (see futex) when many threads try to put
data into the cache.
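The contention-reduction idea behind segmentation can be shown with a toy sketch (an illustration of the general technique only, not Subversion's actual membuffer code):

```python
import threading

class SegmentedCache:
    """Toy segmented (sharded) cache: each segment has its own dict and
    lock, so threads storing different keys rarely fight over one mutex."""

    def __init__(self, segments=16):
        self._segments = [({}, threading.Lock()) for _ in range(segments)]

    def _segment(self, key):
        # Pick a segment by key hash; unrelated keys spread across locks.
        return self._segments[hash(key) % len(self._segments)]

    def put(self, key, value):
        data, lock = self._segment(key)
        with lock:          # only this one segment is locked
            data[key] = value

    def get(self, key, default=None):
        data, lock = self._segment(key)
        with lock:
            return data.get(key, default)

cache = SegmentedCache(segments=4)
cache.put("trunk/foo.c", b"...")
print(cache.get("trunk/foo.c"))  # -> b'...'
```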

> It was suggested that this might be related to memory management, so I
> have looked into a couple of things:
>
> 1) I disabled swap on the server in question. The host had 25GB of RAM
> free and should not have been swapping active memory, however I
> believe disabling swap has solved the problem.
> 2) The server is running NUMA, rather than SMP memory configuration. I
> suspect that this is the reason for the problem (though I have no
> evidence) and that it is not specific to SVN/Apache.
I will try to see whether I can reproduce the effect when I get back
home. Holidays may delay that a bit ...

Apache may have a particular problem with numa. I always assumed
that processes *and* memory get migrated between nodes but it
seems that memory might not - or not as long as the processes
are busy. The problem is that Apache forks all workers from a single
master process. So, they are likely to start on the same node and
allocate memory on it.

One way to circumvent this would be running 2 Apache instances,
each pegged to one NUMA node. It is the closest to how NUMA systems
are supposed to be used that I can think of. I don't know whether the
servers must use different ports in that case, but a small proxy process
in front of them might help with that. No idea how viable that approach
is ...
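If you experiment with the two-instance route, the pinning itself could be done with numactl; a command-line sketch (node numbers and the per-node config files are illustrative, not tested):

```
# Instance 1: CPUs and memory from NUMA node 0
numactl --cpunodebind=0 --membind=0 \
    /opt/subversion-server/bin/httpd -f conf/httpd-node0.conf

# Instance 2: CPUs and memory from NUMA node 1
numactl --cpunodebind=1 --membind=1 \
    /opt/subversion-server/bin/httpd -f conf/httpd-node1.conf
```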

>
> Thanks to Bert on Freenode for his assistance in finding this
> workaround. I suspect that changing NUMA config, or changing Apache's
> threading model would also prevent the problem.

Bert pointed me to your post and the references posted on IRC.
Apart from the things listed above, an "interleaved" numa setup
is probably your best bet because disabling swap is not a terribly
good solution on a server, IMO. Swapping gives you graceful
degradation when resources get scarce.
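An interleaved policy can also be tried per-process with numactl, without touching BIOS settings; a sketch (untested here):

```
# Spread this process's allocations round-robin across all NUMA nodes
numactl --interleave=all /opt/subversion-server/bin/httpd -k start
```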

The penalty of accessing "remote" memory is minimal on a two-
socket system. So, changing to "interleaved" won't cost you much.
I've got svnserve to sustain >50Gb/s over localhost for hours on
my 2x4core numa machine.

-- Stefan^2.
Received on 2014-12-18 18:12:18 CET

This is an archived mail posted to the Subversion Users mailing list.
