Holy cow. Justin, isn't this an ancient bug that we've never figured out?
On Thu, Aug 13, 2009 at 1:59 PM, Blaise Tarr <blaise.tarr_at_gmail.com> wrote:
> There is a reproducible and very serious memory leak when using
> mod_deflate with mod_dav_svn that can consume all of a machine's
> virtual memory and bring it down. To reproduce with Subversion 1.6.4
> and Apache 2.2.13 using the worker MPM on 64-bit RHEL4:
>
> * register mod_deflate as an output filter, and make sure there are no
> other output filters before it (add "SetOutputFilter DEFLATE" to the
> repository's <Location> block; a sketch follows these steps)
>
> * create a 500 MB file, then add and commit it to the repo:
> $ dd if=/dev/urandom of=bigfile bs=5M count=100
> $ svn add bigfile
> $ svn ci bigfile
>
> * retrieve the file with a user-agent that does not supply the
> "Accept-Encoding" header, such as wget:
> $ wget http://localhost:8022/repo/trunk/bigfile
>
> * watch the httpd process's VSZ keep rising, hitting 3 GB when about
> 300 MB of the big file has been downloaded
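>
> For reference, the configuration for the first step might look
> something like the following -- a minimal sketch only: the port and
> the /repo location are taken from the wget URL above, while the
> on-disk repository path is hypothetical:
>
>   Listen 8022
>   <Location /repo>
>     DAV svn
>     SVNPath /var/svn/repo
>     # DEFLATE as the only configured output filter, so nothing
>     # runs ahead of it, per the first step above
>     SetOutputFilter DEFLATE
>   </Location>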
>
> Prior to the wget I issued "ulimit -v $((3 * 1024 * 1024))" so the
> httpd child process died once its VSZ reached 3 GB, producing an
> equally sizable core file with the following stack trace:
>
> #0 memcpy () from /lib64/tls/libc.so.6
> #1 apr_pstrcat (a=0x2b34487038) at strings/apr_strings.c:165
> #2 apr_table_mergen (t=0x7099b8, key=0x2a971b3cb6 "Vary",
> val=0x2a971b3cbb "Accept-Encoding") at tables/apr_tables.c:763
> #3 deflate_out_filter (f=0x70b0a8, bb=0x2b344d52c8)
> at mod_deflate.c:485
> #4 deliver (resource=0x70b158, output=0x70b0a8)
> at subversion/mod_dav_svn/repos.c:3159
> #5 dav_handler (r=0x708fd8) at mod_dav.c:851
> #6 ap_run_handler (r=0x708fd8) at config.c:157
> #7 ap_invoke_handler (r=0x708fd8) at config.c:372
> #8 ap_process_request (r=0x708fd8) at http_request.c:282
> #9 ap_process_http_connection (c=0x6fb190) at http_core.c:190
> #10 ap_run_process_connection (c=0x6fb190) at connection.c:43
> #11 worker_thread (thd=0x6b4bd8, dummy=0x0) at worker.c:544
> #12 start_thread () from /lib64/tls/libpthread.so.0
> #13 clone () from /lib64/tls/libc.so.6
> #14 ?? ()
>
> A few observations:
>
> (gdb) dump_filters r->output_filters
> content_length(0x70a4e8): ctx=0x749688, r=0x708fd8, c=0x6fb190
> http_outerror(0x70a538): ctx=0x749970, r=0x708fd8, c=0x6fb190
> log_input_output(0x6fb970): ctx=0x0, r=0x0, c=0x6fb190
> core(0x6fb9e0): ctx=0x6fb998, r=0x0, c=0x6fb190
> (gdb)
>
> Notice that deflate is no longer in the output filter chain, yet it
> was called. Now look at the output headers:
>
> (gdb) dump_table r->headers_out
> [0] 'Last-Modified'='Thu, 13 Aug 2009 16:22:14 GMT'
> [1] 'ETag'='"2//trunk/bigfile"'
> [2] 'Accept-Ranges'='bytes'
> [3] 'Content-Length'='524288000'
> [4] 'Vary'='Accept-Encoding, Accept-Encoding, Accept-Encoding,
> Accept-Encoding, Accept-Encoding, Accept-Encoding, Accept-Encoding,
> Accept-Encoding, Accept-Encoding, Accept-Encoding, Accept-Encoding,
> Accept-Encoding, Accept-Encoding, ...
>
> There are actually 18,832 "Accept-Encoding" entries in the "Vary"
> header. While that's a lot, the final merged header string only
> accounts for about 300 KB -- nowhere near the 3 GB consumed.
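>
> To see how those entries pile up, here is a standalone sketch of
> APR's table-merge behavior (apr_table_mergen() and apr_pstrcat() are
> the real APR calls from the stack trace above; the program itself is
> just an illustration):
>
>   #include <stdio.h>
>   #include <apr_general.h>
>   #include <apr_pools.h>
>   #include <apr_tables.h>
>
>   int main(void)
>   {
>       apr_pool_t *p;
>       apr_table_t *t;
>       int i;
>
>       apr_initialize();
>       apr_pool_create(&p, NULL);
>       t = apr_table_make(p, 4);
>
>       /* Merging into an existing key concatenates with ", " via
>        * apr_pstrcat(), which allocates a brand-new copy of the
>        * whole value from the pool on every call; pool memory is
>        * not reclaimed until the pool is destroyed. */
>       for (i = 0; i < 3; i++)
>           apr_table_mergen(t, "Vary", "Accept-Encoding");
>
>       printf("%s\n", apr_table_get(t, "Vary"));
>       /* Accept-Encoding, Accept-Encoding, Accept-Encoding */
>
>       apr_pool_destroy(p);
>       apr_terminate();
>       return 0;
>   }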
>
> This is what happens -- when mod_dav_svn generates the response and
> passes the bucket brigade on to the output filter chain, it doesn't
> consider that the first output filter may have removed itself from
> the chain, which is exactly what mod_deflate does when the
> user-agent doesn't indicate that it can accept compressed data.
> Recall that the filter chain is just a pointer to the first element
> of a linked list; after the first element removes itself, the
> pointer mod_dav_svn captured earlier still points at the removed
> filter, not at the actual head of the chain. As a result,
> deflate_out_filter() is invoked with a NULL ctx on every single
> ap_pass_brigade() call mod_dav_svn makes. Each time, the filter
> believes it is being called for the first time, so it allocates its
> data structures again and calls apr_table_mergen() to append
> "Accept-Encoding" to the "Vary" output header; apr_table_mergen()
> in turn uses apr_pstrcat(), which builds a brand-new copy of the
> entire header value in the request pool. Total allocation therefore
> grows quadratically with the number of calls: with 18,832 merges of
> the 17-byte "Accept-Encoding, ", that's roughly 17 * 18832^2 / 2,
> or about 3 GB, which matches the observed VSZ. When retrieving very
> large files (multi-hundred megabytes), this cycle repeats thousands
> of times, and the server crashes once the allocated memory hits the
> ulimit or all the VM on the machine is exhausted.
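>
> To make the stale-pointer mechanics concrete, here is a toy model of
> the situation -- the struct and function names are invented for
> illustration and are not the real httpd types:
>
>   #include <stdio.h>
>
>   typedef struct toy_filter toy_filter;
>   struct toy_filter {
>       const char *name;
>       toy_filter *next;
>   };
>
>   /* Stand-in for ap_remove_output_filter(): unlink 'self' by
>    * advancing the live chain head. */
>   static void toy_remove(toy_filter **head, toy_filter *self)
>   {
>       if (*head == self)
>           *head = self->next;
>   }
>
>   int main(void)
>   {
>       toy_filter core    = { "core",    NULL  };
>       toy_filter deflate = { "deflate", &core };
>
>       toy_filter *output_filters = &deflate; /* r->output_filters */
>       toy_filter *output = output_filters;   /* caller's cached copy,
>                                                 taken before any pass */
>
>       /* On the first pass, deflate removes itself... */
>       toy_remove(&output_filters, &deflate);
>
>       /* ...but the cached pointer still names the removed filter,
>        * so every later pass through 'output' re-enters deflate
>        * with a fresh (NULL) ctx. */
>       printf("live chain head: %s\n", output_filters->name);
>       printf("cached pointer:  %s\n", output->name);
>       return 0;
>   }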
>
> Attached is a patch against 1.6.4 (also applies against the 1.6.x
> branch) that ensures that the correct output filter chain is passed to
> ap_pass_brigade().
>
> Thanks.
>
> --
> Blaise Tarr
>
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383415
> --- subversion-1.6.4/subversion/mod_dav_svn/repos.c.orig 2009-02-16 12:35:25.000000000 -0800
> +++ subversion-1.6.4/subversion/mod_dav_svn/repos.c 2009-08-13 07:16:27.000000000 -0700
> @@ -3023,7 +3023,8 @@
>
> bkt = apr_bucket_eos_create(output->c->bucket_alloc);
> APR_BRIGADE_INSERT_TAIL(bb, bkt);
> - if ((status = ap_pass_brigade(output, bb)) != APR_SUCCESS)
> + if ((status = ap_pass_brigade(resource->info->r->output_filters, bb)) !=
> + APR_SUCCESS)
> return dav_new_error(resource->pool, HTTP_INTERNAL_SERVER_ERROR, 0,
> "Could not write EOS to filter.");
>
> @@ -3087,7 +3088,7 @@
>
> /* create a stream that svndiff data will be written to,
> which will copy it to the network */
> - dc.output = output;
> + dc.output = resource->info->r->output_filters;
> dc.pool = resource->pool;
> o_stream = svn_stream_create(&dc, resource->pool);
> svn_stream_set_write(o_stream, write_to_filter);
> @@ -3156,7 +3157,8 @@
> bkt = apr_bucket_transient_create(block, bufsize,
> output->c->bucket_alloc);
> APR_BRIGADE_INSERT_TAIL(bb, bkt);
> - if ((status = ap_pass_brigade(output, bb)) != APR_SUCCESS) {
> + if ((status = ap_pass_brigade(resource->info->r->output_filters, bb)) !=
> + APR_SUCCESS) {
> /* ### what to do with status; and that HTTP code... */
> return dav_new_error(resource->pool, HTTP_INTERNAL_SERVER_ERROR, 0,
> "Could not write data to filter.");
> @@ -3167,7 +3169,8 @@
> bb = apr_brigade_create(resource->pool, output->c->bucket_alloc);
> bkt = apr_bucket_eos_create(output->c->bucket_alloc);
> APR_BRIGADE_INSERT_TAIL(bb, bkt);
> - if ((status = ap_pass_brigade(output, bb)) != APR_SUCCESS) {
> + if ((status = ap_pass_brigade(resource->info->r->output_filters, bb)) !=
> + APR_SUCCESS) {
> /* ### what to do with status; and that HTTP code... */
> return dav_new_error(resource->pool, HTTP_INTERNAL_SERVER_ERROR, 0,
> "Could not write EOS to filter.");
>
>
------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383440
Received on 2009-08-13 23:06:03 CEST