[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

immense Apache memory leak, plus patch

From: Blaise Tarr <blaise.tarr_at_gmail.com>
Date: Thu, 13 Aug 2009 14:59:54 -0400

There is a reproducible and very serious memory leak when using
mod_deflate with mod_dav_svn that can cause all virtual memory on a
machine to be consumed, leading to its crash. To reproduce with
Subversion 1.6.4 and Apache 2.2.13 using worker MPM on 64 bit RHEL4:

* register mod_deflate as an output filter, and make sure there are no
  other output filters before it (add "SetOutputFilter DEFLATE" to the
  repository's <Location> block)

* create a 500 MB file, then add and commit it to the repo:
  $ dd if=/dev/urandom of=bigfile bs=5M count=100
  $ svn add bigfile
  $ svn ci bigfile

* retrieve the file with a user-agent that does not supply the
  "Accept-Encoding" header, such as wget:
  $ wget http://localhost:8022/repo/trunk/bigfile

* watch the httpd process's VSZ keep rising, hitting 3 GB when about
  300 MB of the big file has been downloaded

Prior to the wget I issued "ulimit -v $((3 * 1024 * 1024))" so the
httpd child process died once its VSZ reached 3 GB, producing an
equally sizable core file with the following stack trace:

#0 memcpy () from /lib64/tls/libc.so.6
#1 apr_pstrcat (a=0x2b34487038) at strings/apr_strings.c:165
#2 apr_table_mergen (t=0x7099b8, key=0x2a971b3cb6 "Vary",
      val=0x2a971b3cbb "Accept-Encoding") at tables/apr_tables.c:763
#3 deflate_out_filter (f=0x70b0a8, bb=0x2b344d52c8)
      at mod_deflate.c:485
#4 deliver (resource=0x70b158, output=0x70b0a8)
      at subversion/mod_dav_svn/repos.c:3159
#5 dav_handler (r=0x708fd8) at mod_dav.c:851
#6 ap_run_handler (r=0x708fd8) at config.c:157
#7 ap_invoke_handler (r=0x708fd8) at config.c:372
#8 ap_process_request (r=0x708fd8) at http_request.c:282
#9 ap_process_http_connection (c=0x6fb190) at http_core.c:190
#10 ap_run_process_connection (c=0x6fb190) at connection.c:43
#11 worker_thread (thd=0x6b4bd8, dummy=0x0) at worker.c:544
#12 start_thread () from /lib64/tls/libpthread.so.0
#13 clone () from /lib64/tls/libc.so.6
#14 ?? ()

A few observations:

(gdb) dump_filters r->output_filters
content_length(0x70a4e8): ctx=0x749688, r=0x708fd8, c=0x6fb190
http_outerror(0x70a538): ctx=0x749970, r=0x708fd8, c=0x6fb190
log_input_output(0x6fb970): ctx=0x0, r=0x0, c=0x6fb190
core(0x6fb9e0): ctx=0x6fb998, r=0x0, c=0x6fb190
(gdb)

Notice that deflate is no longer in the output filter chain, yet it
was called. Now look at the output headers:

(gdb) dump_table r->headers_out
[0] 'Last-Modified'='Thu, 13 Aug 2009 16:22:14 GMT'
[1] 'ETag'='"2//trunk/bigfile"'
[2] 'Accept-Ranges'='bytes'
[3] 'Content-Length'='524288000'
[4] 'Vary'='Accept-Encoding, Accept-Encoding, Accept-Encoding,
Accept-Encoding, Accept-Encoding, Accept-Encoding, Accept-Encoding,
Accept-Encoding, Accept-Encoding, Accept-Encoding, Accept-Encoding,
Accept-Encoding, Accept-Encoding, ...

There are actually 18,832 "Accept-Encoding" entries in the "Vary"
header. While that's a lot, that only accounts for about 300 KB.

This is what happens -- when mod_dav_svn is generating the response
and passes the bucket brigade on to the output filter chain, it
doesn't consider that the first output filter could have removed
itself from the filter chain, as is the case with mod_deflate when the
user-agent doesn't indicate that it can accept compressed data.
Recall that the filter chain is just a pointer to the first element in
a linked list, and after the removal of the first element, the
original filter chain pointer will point to the removed filter, not
the actual filter chain. This can result in deflate_out_filter()
being called with a NULL ctx every time mod_dav_svn calls
ap_pass_brigade(). Since each time deflate_out_filter() thinks it's
being called for the first time, it allocates various data structures,
and calls apr_table_mergen() to append "Accept-Encoding" to the "Vary"
output header, which in turn uses apr_pstrcat(). This operation causes
memory to be allocated at a geometric rate relative to the number of
times deflate_out_filter() is called. When retrieving very large files
(multi-hundred megabytes), this cycle happens thousands of times,
which leads to the crash of the server as the memory allocated reaches
the ulimit, or all the VM on the machine is exhausted.

Attached is a patch against 1.6.4 (also applies against the 1.6.x
branch) that ensures that the correct output filter chain is passed to
ap_pass_brigade().

Thanks.

-- 
Blaise Tarr
------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383415

--- subversion-1.6.4/subversion/mod_dav_svn/repos.c.orig 2009-02-16 12:35:25.000000000 -0800
+++ subversion-1.6.4/subversion/mod_dav_svn/repos.c 2009-08-13 07:16:27.000000000 -0700
@@ -3023,7 +3023,8 @@
 
       bkt = apr_bucket_eos_create(output->c->bucket_alloc);
       APR_BRIGADE_INSERT_TAIL(bb, bkt);
- if ((status = ap_pass_brigade(output, bb)) != APR_SUCCESS)
+ if ((status = ap_pass_brigade(resource->info->r->output_filters, bb)) !=
+ APR_SUCCESS)
         return dav_new_error(resource->pool, HTTP_INTERNAL_SERVER_ERROR, 0,
                              "Could not write EOS to filter.");
 
@@ -3087,7 +3088,7 @@
 
           /* create a stream that svndiff data will be written to,
              which will copy it to the network */
- dc.output = output;
+ dc.output = resource->info->r->output_filters;
           dc.pool = resource->pool;
           o_stream = svn_stream_create(&dc, resource->pool);
           svn_stream_set_write(o_stream, write_to_filter);
@@ -3156,7 +3157,8 @@
         bkt = apr_bucket_transient_create(block, bufsize,
                                           output->c->bucket_alloc);
         APR_BRIGADE_INSERT_TAIL(bb, bkt);
- if ((status = ap_pass_brigade(output, bb)) != APR_SUCCESS) {
+ if ((status = ap_pass_brigade(resource->info->r->output_filters, bb)) !=
+ APR_SUCCESS) {
           /* ### what to do with status; and that HTTP code... */
           return dav_new_error(resource->pool, HTTP_INTERNAL_SERVER_ERROR, 0,
                                "Could not write data to filter.");
@@ -3167,7 +3169,8 @@
       bb = apr_brigade_create(resource->pool, output->c->bucket_alloc);
       bkt = apr_bucket_eos_create(output->c->bucket_alloc);
       APR_BRIGADE_INSERT_TAIL(bb, bkt);
- if ((status = ap_pass_brigade(output, bb)) != APR_SUCCESS) {
+ if ((status = ap_pass_brigade(resource->info->r->output_filters, bb)) !=
+ APR_SUCCESS) {
         /* ### what to do with status; and that HTTP code... */
         return dav_new_error(resource->pool, HTTP_INTERNAL_SERVER_ERROR, 0,
                              "Could not write EOS to filter.");
Received on 2009-08-13 22:56:21 CEST

This is an archived mail posted to the Subversion Dev mailing list.