Hi,
On Fri, Nov 30, 2012 at 8:19 PM, Philip Martin
<philip.martin_at_wandisco.com> wrote:
> Stefan Küng <tortoisesvn_at_gmail.com> writes:
>
>> Here's how to reproduce:
>>
>> $ svn co https://tortoisesvn.googlecode.com/svn/trunk/src/Resources/tools tools
>>
>> get the file here:
>> https://skydrive.live.com/redir?resid=D000F60A347E5B37!11352
>> and replace the one in 'tools' with this one.
>
> I can reproduce locally by importing tools into a local repository,
> checking out, replacing the file and attempting the commit. That is
> using serf 1.1.x. Using serf trunk the commit goes into a loop.
>
I see the same problem in a local repository. With some extra logging,
I see that one of the delta windows isn't handled correctly by the
server.

This is svn trunk with serf:
write_handler window: {sview_offset = 102400, sview_len = 102400,
tview_len = 102400, num_ops = 55, src_ops = 27, ops->action =
svn_txdelta_new, new_data = 0x15cbc28}
write_handler window: {sview_offset = 204800, sview_len = 102400,
tview_len = 102400, num_ops = 143, src_ops = 71, ops->action =
svn_txdelta_new, new_data = 0x15c0028}
write_handler window: {sview_offset = 307200, sview_len = 102400,
tview_len = 102400, num_ops = 23, src_ops = 11, ops->action =
svn_txdelta_new, new_data = 0x15be428}
write_handler window: {sview_offset = 0, sview_len = 0, tview_len =
102400, num_ops = 1, src_ops = 0, ops->action = svn_txdelta_new,
new_data = 0x17e8028}
This is svn 1.7.7 with neon:
write_handler window: {sview_offset = 102400, sview_len = 102400,
tview_len = 102400, num_ops = 55, src_ops = 27, ops->action =
svn_txdelta_new, new_data = 0x15cbc28}
write_handler window: {sview_offset = 204800, sview_len = 102400,
tview_len = 102400, num_ops = 143, src_ops = 71, ops->action =
svn_txdelta_new, new_data = 0x15c0028}
write_handler window: {sview_offset = 307200, sview_len = 102400,
tview_len = 102400, num_ops = 23, src_ops = 11, ops->action =
svn_txdelta_new, new_data = 0x15be428}
write_handler window: {sview_offset = 0, sview_len = 0, tview_len =
102400, num_ops = 1, src_ops = 0, ops->action = svn_txdelta_new,
new_data = 0x17e8028}
...
The core issue seems to have been introduced in r1390435 as part of
the svndiff optimizations.
The attached patch fixes the issue for me. I don't know how it impacts
other parts of the code, so review is appreciated. The patch still
contains logging, so it is not meant to be applied directly!
> As far as I can tell the problem is the client causing mod_dav_svn to
> SEGV (serf trunk keep retrying and causing multiple SEGVs). The
> mod_dav_svn stack trace isn't very useful, I'll need a httpd debug
> build:
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7fe2c42e7700 (LWP 31534)]
> 0x00007fe2c98245cc in apr_brigade_cleanup () from /usr/lib/libaprutil-1.so.0
> (gdb) bt
> #0 0x00007fe2c98245cc in apr_brigade_cleanup ()
> from /usr/lib/libaprutil-1.so.0
> #1 0x00007fe2c75258bf in ?? () from /usr/lib/apache2/modules/mod_dav.so
> #2 0x00007fe2c7528960 in ?? () from /usr/lib/apache2/modules/mod_dav.so
> #3 0x00007fe2c9ee51f0 in ap_run_handler ()
> #4 0x00007fe2c9ee563b in ap_invoke_handler ()
> #5 0x00007fe2c9ef5448 in ap_process_request ()
> #6 0x00007fe2c9ef2308 in ?? ()
> #7 0x00007fe2c9eebbb0 in ap_run_process_connection ()
> #8 0x00007fe2c9efb55d in ?? ()
> #9 0x00007fe2c960f597 in ?? () from /usr/lib/libapr-1.so.0
> #10 0x00007fe2c93cbb50 in start_thread (arg=<optimized out>)
> at pthread_create.c:304
> #11 0x00007fe2c9115a7d in clone ()
> at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #12 0x0000000000000000 in ?? ()
>
> I'd guess it's memory corruption in the server.
Well, besides the client seemingly sending incorrect svndiff windows,
the server should not crash. I got the following stack trace from
httpd in the debugger:
Out of memory - terminating application.
Program received signal SIGABRT, Aborted.
0x00007fff88cd7ce2 in __pthread_kill ()
(gdb) bt
#0 0x00007fff88cd7ce2 in __pthread_kill ()
#1 0x00007fff8381f7d2 in pthread_kill ()
#2 0x00007fff83810a7a in abort ()
#3 0x00000001011ef651 in abort_on_pool_failure (retcode=12) at pool.c:55
#4 0x000000010030e290 in apr_palloc ()
#5 0x00000001012067c7 in svn_stringbuf_create_ensure
(blocksize=12804161111182623672, pool=0x100a72428) at string.c:329
#6 0x0000000101206867 in svn_stringbuf_ncreate (bytes=0x1017dd035
"??", size=12804161111182623667, pool=0x100a72428)
at string.c:346
#7 0x0000000101199dbe in write_handler (baton=0x100a048b8,
buffer=0x1009bfc48 "????á$8\001", len=0x7fff5fbff2d8) at svndiff.c:886
#8 0x00000001012011fa in svn_stream_write (stream=0x100a04900,
data=0x1009bfc48 "????á$8\001", len=0x7fff5fbff2d8) at stream.c:162
#9 0x000000010102d30f in write_stream (stream=0x1009c8ba8,
buf=0x1009bfc48, bufsize=2048) at repos.c:2892
#10 0x00000001007969d4 in dav_handler ()
#11 0x0000000100001cd6 in ap_invoke_handler ()
#12 0x0000000100021433 in ap_process_request ()
#13 0x000000010001eb50 in ap_process_http_connection ()
#14 0x000000010000da28 in ap_process_connection ()
#15 0x0000000100027219 in child_main ()
#16 0x000000010002696a in make_child ()
#17 0x000000010002600b in ap_mpm_run ()
#18 0x0000000100007139 in main ()
(gdb) frame 7
#7 0x0000000101199dbe in write_handler (baton=0x100a048b8,
buffer=0x1009bfc48 "????á$8\001", len=0x7fff5fbff2d8) at svndiff.c:886
886 db->buffer =
(gdb) p *len
$1 = 2048
(gdb) p remaining
$2 = 12804161111182623667
..
(gdb) p db->buffer->data
$5 = 0xe4d8c0d9ec42b70f <Address 0xe4d8c0d9ec42b70f out of bounds>
It looks like the db->buffer struct has been overwritten with data,
which invalidates the db->buffer->data pointer.
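
To illustrate the "server should not crash" point: a minimal sketch of
the kind of sanity check that would turn a bogus length like the
'remaining' value above into an error instead of an out-of-memory
abort. This is not the code in svndiff.c and not the attached patch;
sketch_remaining() and SKETCH_MAX_BUFFER are names made up for this
example, and the 4 MB cap is an arbitrary assumption.

```c
#include <stdint.h>

/* Hypothetical cap on a single buffer; not a Subversion constant. */
#define SKETCH_MAX_BUFFER ((uint64_t)4 * 1024 * 1024)

/* Compute how many bytes are left after CONSUMED bytes of a LEN-byte
   buffer have been parsed.  Returns 0 and stores the result in
   *REMAINING on success, or -1 if the inputs are inconsistent or the
   result is implausibly large.  *REMAINING is untouched on failure. */
static int
sketch_remaining(uint64_t len, uint64_t consumed, uint64_t *remaining)
{
  /* CONSUMED > LEN would wrap around in unsigned arithmetic. */
  if (consumed > len)
    return -1;

  /* Refuse sizes no real window buffer should ever need. */
  if (len - consumed > SKETCH_MAX_BUFFER)
    return -1;

  *remaining = len - consumed;
  return 0;
}
```

A caller would check the return value before allocating anything of
that size and report a corrupt-window error on -1. Such a check only
contains the damage; it does not explain how db->buffer was
overwritten in the first place.
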
A third issue is that serf either segfaults or keeps retrying when the
server drops the connection because of this crash. I'll look into this
further.
Lieven
Received on 2012-12-02 14:59:01 CET