On Sun, Dec 2, 2012 at 2:57 PM, Lieven Govaerts <svnlgo_at_mobsol.be> wrote:
> Hi,
>
> On Fri, Nov 30, 2012 at 8:19 PM, Philip Martin
> <philip.martin_at_wandisco.com> wrote:
>> Stefan Küng <tortoisesvn_at_gmail.com> writes:
>>
>>> Here's how to reproduce:
>>>
>>> $ svn co https://tortoisesvn.googlecode.com/svn/trunk/src/Resources/tools tools
>>>
>>> get the file here:
>>> https://skydrive.live.com/redir?resid=D000F60A347E5B37!11352
>>> and replace the one in 'tools' with this one.
>>
>> I can reproduce locally by importing tools into a local repository,
>> checking out, replacing the file and attempting the commit. That is
>> using serf 1.1.x. Using serf trunk the commit goes into a loop.
>>
>
> I see the same problem in a local repository. With some extra logging
> I see that one of the delta windows isn't handled correctly by the
> server:
>
> This is svn trunk with serf:
> write_handler window: {sview_offset = 102400, sview_len = 102400,
> tview_len = 102400, num_ops = 55, src_ops = 27, ops->action =
> svn_txdelta_new, new_data = 0x15cbc28}
> write_handler window: {sview_offset = 204800, sview_len = 102400,
> tview_len = 102400, num_ops = 143, src_ops = 71, ops->action =
> svn_txdelta_new, new_data = 0x15c0028}
> write_handler window: {sview_offset = 307200, sview_len = 102400,
> tview_len = 102400, num_ops = 23, src_ops = 11, ops->action =
> svn_txdelta_new, new_data = 0x15be428}
> write_handler window: {sview_offset = 0, sview_len = 0, tview_len =
> 102400, num_ops = 1, src_ops = 0, ops->action = svn_txdelta_new,
> new_data = 0x17e8028}
>
> This is svn 1.7.7 with neon:
> write_handler window: {sview_offset = 102400, sview_len = 102400,
> tview_len = 102400, num_ops = 55, src_ops = 27, ops->action =
> svn_txdelta_new, new_data = 0x15cbc28}
> write_handler window: {sview_offset = 204800, sview_len = 102400,
> tview_len = 102400, num_ops = 143, src_ops = 71, ops->action =
> svn_txdelta_new, new_data = 0x15c0028}
> write_handler window: {sview_offset = 307200, sview_len = 102400,
> tview_len = 102400, num_ops = 23, src_ops = 11, ops->action =
> svn_txdelta_new, new_data = 0x15be428}
> write_handler window: {sview_offset = 0, sview_len = 0, tview_len =
> 102400, num_ops = 1, src_ops = 0, ops->action = svn_txdelta_new,
> new_data = 0x17e8028}
> ...
Copy-paste error; this is the actual log content with neon:
write_handler window: {sview_offset = 0, sview_len = 102400, tview_len
= 102400, num_ops = 117, src_ops = 58, ops->action = svn_txdelta_new,
new_data = 0x3461028}
write_handler window: {sview_offset = 102400, sview_len = 102400,
tview_len = 102400, num_ops = 55, src_ops = 27, ops->action =
svn_txdelta_new, new_data = 0x34a9828}
write_handler window: {sview_offset = 204800, sview_len = 102400,
tview_len = 102400, num_ops = 143, src_ops = 71, ops->action =
svn_txdelta_new, new_data = 0x3461028}
write_handler window: {sview_offset = 307200, sview_len = 102400,
tview_len = 102400, num_ops = 23, src_ops = 11, ops->action =
svn_txdelta_new, new_data = 0x345aa28}
write_handler window: {sview_offset = 409600, sview_len = 102400,
tview_len = 102400, num_ops = 1, src_ops = 0, ops->action =
svn_txdelta_new, new_data = 0x3460028}
write_handler window: {sview_offset = 512000, sview_len = 102400,
tview_len = 102400, num_ops = 117, src_ops = 59, ops->action =
svn_txdelta_new, new_data = 0x345da28}
write_handler window: {sview_offset = 614400, sview_len = 102400,
tview_len = 102400, num_ops = 13, src_ops = 6, ops->action =
svn_txdelta_new, new_data = 0x36e1028}
...
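
As a rough aid for comparing the two logs (not part of the attached
patch), the sketch below parses a few of the write_handler lines quoted
above and flags any window whose sview_offset does not continue where the
previous source view ended. The names Window, parse_windows and
non_sequential are made up for this illustration, and the "source views
advance sequentially" rule is only an assumption based on what the neon
log shows; it is a heuristic for spotting where the logs diverge, not a
validity check for svndiff.

```python
import re
from collections import namedtuple

# Hypothetical helper type; the field names mirror the logged window fields.
Window = namedtuple("Window", "sview_offset sview_len tview_len num_ops src_ops")

WINDOW_RE = re.compile(
    r"sview_offset = (\d+), sview_len = (\d+), tview_len = (\d+), "
    r"num_ops = (\d+), src_ops = (\d+)"
)

def parse_windows(log_text):
    """Extract Window tuples from wrapped (and possibly '>'-quoted) log text."""
    # Strip mail quote markers, then collapse the line wrapping.
    lines = [re.sub(r"^(?:>\s?)+", "", line) for line in log_text.splitlines()]
    flat = " ".join(" ".join(lines).split())
    return [Window(*map(int, m.groups())) for m in WINDOW_RE.finditer(flat)]

def non_sequential(windows):
    """Return (index, expected_offset, window) where the source view jumps."""
    jumps = []
    for i in range(1, len(windows)):
        expected = windows[i - 1].sview_offset + windows[i - 1].sview_len
        if windows[i].sview_offset != expected:
            jumps.append((i, expected, windows[i]))
    return jumps

# Last three windows of the serf log, as quoted in this thread.
SERF_LOG = """\
> write_handler window: {sview_offset = 204800, sview_len = 102400,
> tview_len = 102400, num_ops = 143, src_ops = 71, ops->action =
> svn_txdelta_new, new_data = 0x15c0028}
> write_handler window: {sview_offset = 307200, sview_len = 102400,
> tview_len = 102400, num_ops = 23, src_ops = 11, ops->action =
> svn_txdelta_new, new_data = 0x15be428}
> write_handler window: {sview_offset = 0, sview_len = 0, tview_len =
> 102400, num_ops = 1, src_ops = 0, ops->action = svn_txdelta_new,
> new_data = 0x17e8028}
"""

# The corresponding three windows of the neon log.
NEON_LOG = """\
write_handler window: {sview_offset = 204800, sview_len = 102400,
tview_len = 102400, num_ops = 143, src_ops = 71, ops->action =
svn_txdelta_new, new_data = 0x3461028}
write_handler window: {sview_offset = 307200, sview_len = 102400,
tview_len = 102400, num_ops = 23, src_ops = 11, ops->action =
svn_txdelta_new, new_data = 0x345aa28}
write_handler window: {sview_offset = 409600, sview_len = 102400,
tview_len = 102400, num_ops = 1, src_ops = 0, ops->action =
svn_txdelta_new, new_data = 0x3460028}
"""

if __name__ == "__main__":
    print("serf:", non_sequential(parse_windows(SERF_LOG)))
    print("neon:", non_sequential(parse_windows(NEON_LOG)))
```

With these excerpts only the serf log is flagged: its window with
num_ops = 1 has sview_offset = 0 and sview_len = 0, where the neon log
has sview_offset = 409600 and sview_len = 102400.
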
>
> The core issue seems to be introduced in r1390435 as part of the
> svndiff optimizations.
>
> Attached patch fixes the issue for me. I don't know how it impacts
> other parts of the code, so review is appreciated. The patch still
> contains logging, so it is not meant to be applied directly!
>
>> As far as I can tell the problem is the client causing mod_dav_svn to
>> SEGV (serf trunk keeps retrying and causing multiple SEGVs). The
>> mod_dav_svn stack trace isn't very useful; I'll need an httpd debug
>> build:
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> [Switching to Thread 0x7fe2c42e7700 (LWP 31534)]
>> 0x00007fe2c98245cc in apr_brigade_cleanup () from /usr/lib/libaprutil-1.so.0
>> (gdb) bt
>> #0 0x00007fe2c98245cc in apr_brigade_cleanup ()
>> from /usr/lib/libaprutil-1.so.0
>> #1 0x00007fe2c75258bf in ?? () from /usr/lib/apache2/modules/mod_dav.so
>> #2 0x00007fe2c7528960 in ?? () from /usr/lib/apache2/modules/mod_dav.so
>> #3 0x00007fe2c9ee51f0 in ap_run_handler ()
>> #4 0x00007fe2c9ee563b in ap_invoke_handler ()
>> #5 0x00007fe2c9ef5448 in ap_process_request ()
>> #6 0x00007fe2c9ef2308 in ?? ()
>> #7 0x00007fe2c9eebbb0 in ap_run_process_connection ()
>> #8 0x00007fe2c9efb55d in ?? ()
>> #9 0x00007fe2c960f597 in ?? () from /usr/lib/libapr-1.so.0
>> #10 0x00007fe2c93cbb50 in start_thread (arg=<optimized out>)
>> at pthread_create.c:304
>> #11 0x00007fe2c9115a7d in clone ()
>> at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
>> #12 0x0000000000000000 in ?? ()
>>
>> I'd guess it's memory corruption in the server.
>
> Well, besides the client seemingly sending incorrect svndiff windows,
> the server should not crash. I got the following stack trace from
> httpd in the debugger:
>
> Out of memory - terminating application.
>
> Program received signal SIGABRT, Aborted.
> 0x00007fff88cd7ce2 in __pthread_kill ()
> (gdb) bt
> #0 0x00007fff88cd7ce2 in __pthread_kill ()
> #1 0x00007fff8381f7d2 in pthread_kill ()
> #2 0x00007fff83810a7a in abort ()
> #3 0x00000001011ef651 in abort_on_pool_failure (retcode=12) at pool.c:55
> #4 0x000000010030e290 in apr_palloc ()
> #5 0x00000001012067c7 in svn_stringbuf_create_ensure
> (blocksize=12804161111182623672, pool=0x100a72428) at string.c:329
> #6 0x0000000101206867 in svn_stringbuf_ncreate (bytes=0x1017dd035
> "??", size=12804161111182623667, pool=0x100a72428)
> at string.c:346
> #7 0x0000000101199dbe in write_handler (baton=0x100a048b8,
> buffer=0x1009bfc48 "????ل$8\001", len=0x7fff5fbff2d8) at svndiff.c:886
> #8 0x00000001012011fa in svn_stream_write (stream=0x100a04900,
> data=0x1009bfc48 "????ل$8\001", len=0x7fff5fbff2d8) at stream.c:162
> #9 0x000000010102d30f in write_stream (stream=0x1009c8ba8,
> buf=0x1009bfc48, bufsize=2048) at repos.c:2892
> #10 0x00000001007969d4 in dav_handler ()
> #11 0x0000000100001cd6 in ap_invoke_handler ()
> #12 0x0000000100021433 in ap_process_request ()
> #13 0x000000010001eb50 in ap_process_http_connection ()
> #14 0x000000010000da28 in ap_process_connection ()
> #15 0x0000000100027219 in child_main ()
> #16 0x000000010002696a in make_child ()
> #17 0x000000010002600b in ap_mpm_run ()
> #18 0x0000000100007139 in main ()
> (gdb) frame 7
> #7 0x0000000101199dbe in write_handler (baton=0x100a048b8,
> buffer=0x1009bfc48 "????ل$8\001", len=0x7fff5fbff2d8) at svndiff.c:886
> 886 db->buffer =
> (gdb) p *len
> $1 = 2048
> (gdb) p remaining
> $2 = 12804161111182623667
> ..
> (gdb) p db->buffer->data
> $5 = 0xe4d8c0d9ec42b70f <Address 0xe4d8c0d9ec42b70f out of bounds>
>
> Looks like the db->buffer struct is overwritten with data, thereby
> invalidating the db->buffer->data pointer.
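
To put the numbers from the gdb session in perspective: the value of
`remaining` is larger than 2^63, so it cannot be a genuine buffer
length. The sketch below reinterprets it as a signed 64-bit integer and
shows the kind of bound check that would reject such a length before any
allocation is attempted. This is only an illustration, not the code in
svndiff.c; check_length and ASSUMED_MAX_LEN are hypothetical names, and
the limit of 102400 is merely borrowed from the view lengths in the logs
above.

```python
UINT64_MAX = 2**64 - 1
INT64_MAX = 2**63 - 1

# Assumption for this sketch only: cap lengths at the view size seen in the logs.
ASSUMED_MAX_LEN = 102400

def as_signed_64(value):
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    if not 0 <= value <= UINT64_MAX:
        raise ValueError("not an unsigned 64-bit value: %r" % (value,))
    return value - 2**64 if value > INT64_MAX else value

def check_length(length, limit=ASSUMED_MAX_LEN):
    """Return length unchanged, or raise if it is implausibly large."""
    if length < 0 or length > limit:
        raise ValueError("implausible length %d (limit %d)" % (length, limit))
    return length

if __name__ == "__main__":
    remaining = 12804161111182623667  # from 'p remaining' in the gdb session
    print(as_signed_64(remaining))    # negative, i.e. the top bit is set
    try:
        check_length(remaining)
    except ValueError as err:
        print("rejected:", err)
```

A check along these lines would turn the abort in apr_palloc into an
ordinary error, although it would not explain how db->buffer came to
hold garbage in the first place.
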
>
>
> A third issue is that serf is either segfaulting or retrying when the
> server aborts the connection due to this segfault. I'll look into this
> further.
>
> Lieven
Received on 2012-12-02 15:01:32 CET