Re: Error "An existing connection was forcibly closed by the remote host" with F5 content switch or working copy on shared drive

From: Justin Johnson <justin_at_honesthacker.com>
Date: Wed, 17 Mar 2010 06:47:04 -0500

On Fri, Mar 12, 2010 at 9:34 AM, Justin Johnson <justin_at_honesthacker.com> wrote:

> Hi,
>
> I'm trying to understand why the following error occurs.
>
> svn: REPORT request failed on '/svn/reponame/!svn/vcc/default'
> svn: REPORT of '/svn/reponame/!svn/vcc/default': Could not read response
> body: An existing connection was forcibly closed by the remote host.
> (http://HOSTNAME)
> command exit code: 1
>
> I've seen this error in a couple of scenarios:
> 1) when performing a checkout on a Windows box with the working copy stored
> on a drive mapped to a NAS share
> 2) when performing a checkout on a Windows box and the server is an F5
> content switch that just redirects traffic to the Subversion server
>
> The first scenario is of less concern to me, but I mention it anyway since
> I think it is the same problem.
>
> For the second scenario, I worked with someone on our networking team to
> understand the problem. What he discovered and how he "resolved" it with
> our F5 content switch can be found below. The server is running Solaris 10,
> Subversion 1.6.6, Apache 2.2.11, and repositories are served via HTTP. The
> client is running Windows XP SP3 and Subversion 1.6.7 (error occurs with
> TortoiseSVN as well), but the error also occurs on Windows Server 2003. I
> haven't tested any other Windows client OSes and haven't seen the error on
> UNIX, but suspect the underlying problem may exist there and the OS handles
> it more gracefully. Here is the explanation by my networking contact.
>
> ****
> The problem is that the client's receive buffer is filling up and staying
> full for a long period of time. When this occurs, the client advertises a
> TCP window size of 0 in the packets it sends to the destination F5. This
> also happens when the client goes directly against a server. The server
> seems to tolerate it while the F5 does not.
>
> Last year, I took traces of the traffic against the server by the client
> directly, and through the F5, and saw that the server was seeing different
> MTU and options from the F5. I modified the standard TCP profile on the F5
> to have it proxy the TCP options the client offered so the server would get
> them. I also set it to proxy the MTU setting the client offered. This
> seemed to have fixed the problem at that time. But your current testing
> failed.
>
> Upon closer inspection, I determined that the F5 was resetting the
> connections, not the server as I had previously thought. This time, I
> turned off those two options from last year and increased the Maximum
> Segment Retransmissions from the default of 8 to 16. This controls the
> number of times the F5 resends a packet after it gets no response. This
> also controls the zero window probes he sends to see if the client can
> receive data yet. TCP uses a back-off algorithm and increases the time
> between retries. With 8 attempts, the total retry time is just over a
> minute. I suspect retries of 16 will cause it to retry for 5 or 10 minutes.
>
> I would really like to get this in front of SVN developers, because
> something is getting hosed on the client that causes him to stop pulling off
> the receive buffer. If the zero window lasted 10 seconds or so, it would
> not be a problem. But for him to in effect go offline for over a minute is,
> I believe, a bug. We can just assume that the reason the error does not
> occur when you hit the server directly is that the Sun box handles the zero
> window issue differently, or it might just retry more than 8 times by
> default. Might be a question for the UNIX team as to the retry count. If
> we get some time, we could do some packet captures and find out for certain.
>
> Yesterday and today, I did a few other things that *did not* help. I
> increased the TCP receive buffers on the client side sessions, then on the
> server side sessions, then both. I then turned off all of the TCP options
> in the F5 default TCP profile.
> ****
>
> So, in summary, my problem is currently "resolved" by increasing the
> Maximum Segment Retransmissions from the default of 8 to 16 on the F5.
> However, as I mentioned above I've seen this problem when connecting
> directly to the Subversion server and storing the working copy on a network
> drive.
>
> Does anyone have any ideas? Is this something that can be fixed in the
> Subversion code itself?
>
> Thanks.
> Justin
>
>
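For anyone trying to follow the back-off arithmetic in the quoted explanation, here is a rough sketch of how the total retry time grows with the retransmission count. The initial timeout (0.25 s), the doubling factor, and the 64 s cap are my own assumptions, picked only for illustration; the thread does not state the F5's actual timer values, and the function name total_retry_time is hypothetical.

```python
def total_retry_time(retries, initial_rto=0.25, factor=2.0, max_rto=64.0):
    """Sum the wait intervals of an exponential back-off.

    All defaults are illustrative assumptions, not F5 settings:
    initial_rto is the first wait in seconds, factor multiplies the
    wait after every attempt, and max_rto caps a single wait.
    """
    total = 0.0
    interval = initial_rto
    for _ in range(retries):
        # Each retry waits for the current interval, limited by the cap.
        total += min(interval, max_rto)
        interval *= factor
    return total


if __name__ == "__main__":
    for count in (8, 16):
        seconds = total_retry_time(count)
        print(f"{count} retries: {seconds:.2f} s ({seconds / 60:.1f} min)")
```

Under these assumed values, 8 retries add up to roughly a minute and 16 retries to several minutes, which is at least in line with the estimates quoted above. With other timer settings the totals would differ, so a packet capture is still the way to confirm the real behavior.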
No responses? This seems like something more for the dev list, but I wanted
to follow protocol and wait for a response from the users list first.

Thanks.
Justin
Received on 2010-03-17 12:47:37 CET

This is an archived mail posted to the Subversion Users mailing list.