[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Issue #2580 revisited: Windows unclean TCP close [SEVERE]

From: Stefan Sperling <stsp_at_elego.de>
Date: Fri, 16 Oct 2009 14:31:20 +0100

On Thu, Oct 15, 2009 at 09:38:23AM -0700, Bob Denny wrote:
> As of SVN 1.6.5, the svn client forcibly kills
> the tunnel subprocess as soon as it receives the last of the data that
> it expects.

This is not entirely correct. We have been killing the ssh client
for a long time.

What's new in 1.6.5 is that we now send SIGTERM instead of SIGKILL,
which allows an SSH process to clean up a master socket it might be
managing for for SSH connection pooling.

But it's *not* a new problem in 1.6.5.

> This prevents the tunnel proc from completing its
> conversation with the remote sshd, leaving it and its child svnserve
> haning there for up to a half hour. In my case the tunnel is
> TortoisePLink, but it also happens with the OpenSSH 'ssh.exe' tunnel,
> for the same reason.
>
> Specifically, in libsvn_ra_svn\client.c, a call is made
>
> apr_pool_note_subprocess(pool, proc, APR_KILL_ONLY_ONCE);
>
> Inspection of the apr sources reveals that, on Windows, the only
> significant kill modes are APR_KILL_ALWAYS and APR_KILL_NEVER. Any
> other modes (e.g. APR_KILL_ONCE) are translated on Windows to
> APR_KILL_ALWAYS, resulting in the svn client doing an immediate
> TerminateProcess() on the tunnel, preventing it from cleanly closing
> the TCP connection.
>
> Here is the tail of a packet trace showing the unclean close:
>
> 43 14.439453 svn.dc3.com bob.dc3.com TCP:Flags=...AP..., SrcPort=7822 ...
> 44 14.439453 bob.dc3.com svn.dc3.com TCP:Flags=...AP..., SrcPort=4206 ...
> 45 14.529297 svn.dc3.com bob.dc3.com TCP:Flags=...AP..., SrcPort=7822 ...
> 46 14.529297 bob.dc3.com svn.dc3.com TCP:Flags=...AP..., SrcPort=4206 ...
> 47 14.611328 svn.dc3.com bob.dc3.com TCP:Flags=...AP..., SrcPort=7822 ...
> 48 14.613281 bob.dc3.com svn.dc3.com TCP:Flags=...AP..., SrcPort=4206 ...
> 49 14.693359 svn.dc3.com bob.dc3.com TCP:Flags=...AP..., SrcPort=7822 ...
> 50 14.714844 bob.dc3.com svn.dc3.com TCP:Flags=...A.R.., SrcPort=4206 ...
>
> The 'R' flag means a reset packet. The remote sshd tried to send an
> ACK to me, and my tunnel was killed, so my TCP stack sent back this
> reset packet saying "forget it, this TCP connection is gone". sshd
> goes into bozo mode and leaves its svnserve subprocess running.
>
> A tail of the tunnel packet log (from an 'svn ls' command) looks like this:
>
> 00000210 29 20 28 20 37 3a 69 6e 73 74 72 65 67 20 64 69 ) ( 7:instreg di
> 00000220 72 20 30 20 66 61 6c 73 65 20 30 20 28 20 32 37 r 0 false 0 ( 27
> 00000230 3a 31 39 37 30 2d 30 31 2d 30 31 54 30 30 3a 30 :1970-01-01T00:0
> 00000240 30 3a 30 30 2e 30 30 30 30 30 30 5a 20 29 20 28 0:00.000000Z ) (
> 00000250 20 29 20 29 20 28 20 37 3a 63 6c 61 70 61 63 6b ) ) ( 7:clapack
> 00000260 20 64 69 72 20 30 20 66 61 6c 73 65 20 30 20 28 dir 0 false 0 (
> 00000270 20 32 37 3a 31 39 37 30 2d 30 31 2d 30 31 54 30 27:1970-01-01T0
> 00000280 30 3a 30 30 3a 30 30 2e 30 30 30 30 30 30 5a 20 0:00:00.000000Z
> 00000290 29 20 28 20 29 20 29 20 29 20 29 20 29 20 ) ( ) ) ) ) )
> Outgoing packet type 96 / 0x60 (SSH2_MSG_CHANNEL_EOF)
> 00000000 00 00 00 00 ....
>
> I changed the line of code in client.c to
>
> apr_pool_note_subprocess(pool, proc, APR_KILL_NEVER);
>
> and the problem is cured.

But this creates another problem which killing the subprocess is
meant to cure. If long-lived Subversion clients (e.g. IDE integrations)
using the SVN libraries open a connection via svn+ssh:// it's possible
that the ssh subprocesses created never die. Over time, ssh processes
accumulate on the client workstation, doing nothing but waiting to be
killed.

See the comment above the line you're quoting:

  /* Arrange for the tunnel agent to get a SIGTERM on pool
   * cleanup. This is a little extreme, but the alternatives
   * weren't working out.
   *
   * Closing the pipes and waiting for the process to die
   * was prone to mysterious hangs which are difficult to
   * diagnose (e.g. svnserve dumps core due to unrelated bug;
   * sshd goes into zombie state; ssh connection is never
   * closed; ssh never terminates).
   * See also the long dicussion in issue #2580 if you really
   * want to know various reasons for these problems and
   * the different opinions on this issue.
   */

> Here is the tail of a packet trace showing the proper TCP close:
>
> 42 13.656250 svn.dc3.com bob.dc3.com TCP:Flags=...AP..., SrcPort=7822 ...
> 43 13.656250 bob.dc3.com svn.dc3.com TCP:Flags=...AP..., SrcPort=4197 ...
> 44 13.734375 svn.dc3.com bob.dc3.com TCP:Flags=...AP..., SrcPort=7822 ...
> 45 13.734375 bob.dc3.com svn.dc3.com TCP:Flags=...AP..., SrcPort=4197 ...
> 46 13.812500 svn.dc3.com bob.dc3.com TCP:Flags=...AP..., SrcPort=7822 ...
> 47 13.828125 bob.dc3.com svn.dc3.com TCP:Flags=...AP..., SrcPort=4197 ...
> 48 13.906250 svn.dc3.com bob.dc3.com TCP:Flags=...AP..., SrcPort=7822 ...
> 49 13.906250 bob.dc3.com svn.dc3.com TCP:Flags=...AP..., SrcPort=4197 ...
> 50 13.906250 bob.dc3.com svn.dc3.com TCP:Flags=...A...F, SrcPort=4197 ...
> 51 13.984375 svn.dc3.com bob.dc3.com TCP:Flags=...A...., SrcPort=7822 ...
> 52 13.984375 svn.dc3.com bob.dc3.com TCP:Flags=...A...., SrcPort=7822 ...
> 53 13.984375 svn.dc3.com bob.dc3.com TCP:Flags=...A...F, SrcPort=7822 ...
> 54 13.984375 bob.dc3.com svn.dc3.com TCP:Flags=...A...., SrcPort=4197 ...
>
> You can see the tunnel and the remote sshd exchanging FIN packets. And here is the tail of the tunnel packet log for the svn ls command:
>
> 00000210 29 20 28 20 37 3a 69 6e 73 74 72 65 67 20 64 69 ) ( 7:instreg di
> 00000220 72 20 30 20 66 61 6c 73 65 20 30 20 28 20 32 37 r 0 false 0 ( 27
> 00000230 3a 31 39 37 30 2d 30 31 2d 30 31 54 30 30 3a 30 :1970-01-01T00:0
> 00000240 30 3a 30 30 2e 30 30 30 30 30 30 5a 20 29 20 28 0:00.000000Z ) (
> 00000250 20 29 20 29 20 28 20 37 3a 63 6c 61 70 61 63 6b ) ) ( 7:clapack
> 00000260 20 64 69 72 20 30 20 66 61 6c 73 65 20 30 20 28 dir 0 false 0 (
> 00000270 20 32 37 3a 31 39 37 30 2d 30 31 2d 30 31 54 30 27:1970-01-01T0
> 00000280 30 3a 30 30 3a 30 30 2e 30 30 30 30 30 30 5a 20 0:00:00.000000Z
> 00000290 29 20 28 20 29 20 29 20 29 20 29 20 29 20 ) ( ) ) ) ) )
> Outgoing packet type 96 / 0x60 (SSH2_MSG_CHANNEL_EOF)
> 00000000 00 00 00 00 ....
> Event Log: Sent EOF message
> Incoming packet type 96 / 0x60 (SSH2_MSG_CHANNEL_EOF)
> 00000000 00 00 01 00 ....
> Incoming packet type 98 / 0x62 (SSH2_MSG_CHANNEL_REQUEST)
> 00000000 00 00 01 00 00 00 00 0b 65 78 69 74 2d 73 74 61 ........exit-sta
> 00000010 74 75 73 00 00 00 00 00 tus.....
> Event Log: Server sent command exit status 0
> Incoming packet type 97 / 0x61 (SSH2_MSG_CHANNEL_CLOSE)
> 00000000 00 00 01 00 ....
> Outgoing packet type 97 / 0x61 (SSH2_MSG_CHANNEL_CLOSE)
> 00000000 00 00 00 00 ....
> Event Log: Disconnected: All channels closed
>
> You can see that the tunnel was allowed to complete its activity and
> exit through its normal paths. The remote sshd/svnserve processes exit
> cleanly as well. Problem solved!
>
> RECOMMENDATIONS:
>
> The cleanest and safest way to handle this would seem to be:
>
> 1. Modify apr to support the kill mode APR_KILL_AFTER_TIMEOUT *on
> Windows*. This would cause the tunnel to be killed after three
> seconds, presumably plenty of time.
>
> 2. Modify libsvn_ra_svn\client.c to use APR_KILL_AFTER_TIMEOUT *on
> Windows* instead of APR_KILL_ONCE.
>
> However, patching apr is probably not acceptable, so instead:

Patching APR is certainly acceptable.
Would you be willing to work on a patch of APR and submit it there?

> 1. Modify libsvn_ra_svn\client.c to use APR_KILL_NEVER *on Windows*
> instead of APR_KILL_ONCE.
>
> What do you think?

KILL_NEVER is not a good solution. If it was, it would already be done
this way, and we'd never have had to mess around with killing SSH
processes in the first place.

Stefan

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2408224
Received on 2009-10-16 15:32:01 CEST

This is an archived mail posted to the Subversion Dev mailing list.