[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: immense Apache memory leak, plus patch

From: Greg Stein <gstein_at_gmail.com>
Date: Fri, 14 Aug 2009 12:17:43 +0200

It is *really* old. The output filter is passed as a parameter to the
mod_dav repository plugin (see deliver()). So it's been around since
about 1999 or 2000 -- before Subversion was even started.

I don't see how this would work, however, because there are *still* a
lot of uses of the 'output' variable. Just changing the one in
ap_pass_brigade() is going to be insufficient. Seems that all of them
should really be changed.

From a design standpoint, it would seem that a filter should never
attempt to disengage itself from the filter chain. Instead, it should
just pass the brigade along.

Cheers,
-g

On Thu, Aug 13, 2009 at 23:05, Ben Collins-Sussman<sussman_at_red-bean.com> wrote:
> Holy cow.  Justin, isn't this an ancient bug that we've never figured out?
>
> On Thu, Aug 13, 2009 at 1:59 PM, Blaise Tarr<blaise.tarr_at_gmail.com> wrote:
>> There is a reproducible and very serious memory leak when using
>> mod_deflate with mod_dav_svn that can cause all virtual memory on a
>> machine to be consumed, leading to its crash. To reproduce with
>> Subversion 1.6.4 and Apache 2.2.13 using worker MPM on 64 bit RHEL4:
>>
>> * register mod_deflate as an output filter, and make sure there are no
>>  other output filters before it (add "SetOutputFilter DEFLATE" to the
>>  repository's <Location> block)
>>
>> * create a 500 MB file, then add and commit it to the repo:
>>  $ dd if=/dev/urandom of=bigfile bs=5M count=100
>>  $ svn add bigfile
>>  $ svn ci bigfile
>>
>> * retrieve the file with a user-agent that does not supply the
>>  "Accept-Encoding" header, such as wget:
>>  $ wget http://localhost:8022/repo/trunk/bigfile
>>
>> * watch the httpd process's VSZ keep rising, hitting 3 GB when about
>>  300 MB of the big file has been downloaded
>>
>> Prior to the wget I issued "ulimit -v $((3 * 1024 * 1024))" so the
>> httpd child process died once its VSZ reached 3 GB, producing an
>> equally sizable core file with the following stack trace:
>>
>> #0  memcpy () from /lib64/tls/libc.so.6
>> #1  apr_pstrcat (a=0x2b34487038) at strings/apr_strings.c:165
>> #2  apr_table_mergen (t=0x7099b8, key=0x2a971b3cb6 "Vary",
>>      val=0x2a971b3cbb "Accept-Encoding") at tables/apr_tables.c:763
>> #3  deflate_out_filter (f=0x70b0a8, bb=0x2b344d52c8)
>>      at mod_deflate.c:485
>> #4  deliver (resource=0x70b158, output=0x70b0a8)
>>      at subversion/mod_dav_svn/repos.c:3159
>> #5  dav_handler (r=0x708fd8) at mod_dav.c:851
>> #6  ap_run_handler (r=0x708fd8) at config.c:157
>> #7  ap_invoke_handler (r=0x708fd8) at config.c:372
>> #8  ap_process_request (r=0x708fd8) at http_request.c:282
>> #9  ap_process_http_connection (c=0x6fb190) at http_core.c:190
>> #10 ap_run_process_connection (c=0x6fb190) at connection.c:43
>> #11 worker_thread (thd=0x6b4bd8, dummy=0x0) at worker.c:544
>> #12 start_thread () from /lib64/tls/libpthread.so.0
>> #13 clone () from /lib64/tls/libc.so.6
>> #14 ?? ()
>>
>> A few observations:
>>
>> (gdb) dump_filters r->output_filters
>> content_length(0x70a4e8): ctx=0x749688, r=0x708fd8, c=0x6fb190
>> http_outerror(0x70a538): ctx=0x749970, r=0x708fd8, c=0x6fb190
>> log_input_output(0x6fb970): ctx=0x0, r=0x0, c=0x6fb190
>> core(0x6fb9e0): ctx=0x6fb998, r=0x0, c=0x6fb190
>> (gdb)
>>
>> Notice that deflate is no longer in the output filter chain, yet it
>> was called. Now look at the output headers:
>>
>> (gdb) dump_table r->headers_out
>> [0] 'Last-Modified'='Thu, 13 Aug 2009 16:22:14 GMT'
>> [1] 'ETag'='"2//trunk/bigfile"'
>> [2] 'Accept-Ranges'='bytes'
>> [3] 'Content-Length'='524288000'
>> [4] 'Vary'='Accept-Encoding, Accept-Encoding, Accept-Encoding,
>> Accept-Encoding, Accept-Encoding, Accept-Encoding, Accept-Encoding,
>> Accept-Encoding, Accept-Encoding, Accept-Encoding, Accept-Encoding,
>> Accept-Encoding, Accept-Encoding, ...
>>
>> There are actually 18,832 "Accept-Encoding" entries in the "Vary"
>> header. While that's a lot, that only accounts for about 300 KB.
>>
>> This is what happens -- when mod_dav_svn is generating the response
>> and passes the bucket brigade on to the output filter chain, it
>> doesn't consider that the first output filter could have removed
>> itself from the filter chain, as is the case with mod_deflate when the
>> user-agent doesn't indicate that it can accept compressed data.
>> Recall that the filter chain is just a pointer to the first element in
>> a linked list, and after the removal of the first element, the
>> original filter chain pointer will point to the removed filter, not
>> the actual filter chain.  This can result in deflate_out_filter()
>> being called with a NULL ctx every time mod_dav_svn calls
>> ap_pass_brigade(). Since each time deflate_out_filter() thinks it's
>> being called for the first time, it allocates various data structures,
>> and calls apr_table_mergen() to append "Accept-Encoding" to the "Vary"
>> output header, which in turn uses apr_pstrcat(). This operation causes
>> memory to be allocated at a geometric rate relative to the number of
>> times deflate_out_filter() is called. When retrieving very large files
>> (multi-hundred megabytes), this cycle happens thousands of times,
>> which leads to the crash of the server as the memory allocated reaches
>> the ulimit, or all the VM on the machine is exhausted.
>>
>> Attached is a patch against 1.6.4 (also applies against the 1.6.x
>> branch) that ensures that the correct output filter chain is passed to
>> ap_pass_brigade().
>>
>> Thanks.
>>
>> --
>> Blaise Tarr
>>
>> ------------------------------------------------------
>> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383415
>> --- subversion-1.6.4/subversion/mod_dav_svn/repos.c.orig        2009-02-16 12:35:25.000000000 -0800
>> +++ subversion-1.6.4/subversion/mod_dav_svn/repos.c     2009-08-13 07:16:27.000000000 -0700
>> @@ -3023,7 +3023,8 @@
>>
>>       bkt = apr_bucket_eos_create(output->c->bucket_alloc);
>>       APR_BRIGADE_INSERT_TAIL(bb, bkt);
>> -      if ((status = ap_pass_brigade(output, bb)) != APR_SUCCESS)
>> +      if ((status = ap_pass_brigade(resource->info->r->output_filters, bb)) !=
>> +          APR_SUCCESS)
>>         return dav_new_error(resource->pool, HTTP_INTERNAL_SERVER_ERROR, 0,
>>                              "Could not write EOS to filter.");
>>
>> @@ -3087,7 +3088,7 @@
>>
>>           /* create a stream that svndiff data will be written to,
>>              which will copy it to the network */
>> -          dc.output = output;
>> +          dc.output = resource->info->r->output_filters;
>>           dc.pool = resource->pool;
>>           o_stream = svn_stream_create(&dc, resource->pool);
>>           svn_stream_set_write(o_stream, write_to_filter);
>> @@ -3156,7 +3157,8 @@
>>         bkt = apr_bucket_transient_create(block, bufsize,
>>                                           output->c->bucket_alloc);
>>         APR_BRIGADE_INSERT_TAIL(bb, bkt);
>> -        if ((status = ap_pass_brigade(output, bb)) != APR_SUCCESS) {
>> +        if ((status = ap_pass_brigade(resource->info->r->output_filters, bb)) !=
>> +            APR_SUCCESS) {
>>           /* ### what to do with status; and that HTTP code... */
>>           return dav_new_error(resource->pool, HTTP_INTERNAL_SERVER_ERROR, 0,
>>                                "Could not write data to filter.");
>> @@ -3167,7 +3169,8 @@
>>       bb = apr_brigade_create(resource->pool, output->c->bucket_alloc);
>>       bkt = apr_bucket_eos_create(output->c->bucket_alloc);
>>       APR_BRIGADE_INSERT_TAIL(bb, bkt);
>> -      if ((status = ap_pass_brigade(output, bb)) != APR_SUCCESS) {
>> +      if ((status = ap_pass_brigade(resource->info->r->output_filters, bb)) !=
>> +          APR_SUCCESS) {
>>         /* ### what to do with status; and that HTTP code... */
>>         return dav_new_error(resource->pool, HTTP_INTERNAL_SERVER_ERROR, 0,
>>                              "Could not write EOS to filter.");
>>
>>
>
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383440
>

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383592
Received on 2009-08-14 12:18:00 CEST

This is an archived mail posted to the Subversion Dev mailing list.