[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [Issue 3347] svn never times out when (public) IP changes

From: Stefan Sperling <stsp_at_elego.de>
Date: Thu, 28 May 2009 22:14:57 +0100

On Wed, May 27, 2009 at 10:34:00PM +0100, Stefan Sperling wrote:
> Right. So we would be using keep-alives as an out-of-band mechanism,
> in the sense that we ask TCP to please tell us if we get a reset from
> the other side when sending non-svn-protocol messages. The svn protocol
> itself is then isolated from having to deal with this network-layer
> problem.
> I see your point and will try to find a way to do this.

Here's a patch.

I've tested it by:

1) Configuring my kernel to start sending keep-alives after 20 seconds
   instead of 2 hours (the default).

2) Running a large checkout over svn:// from a remote server.

3) Causing the connection to stall by making my router drop any packets
   sent from the server to the client. This simulates the situation were
   the client is no longer able to receive traffic from the server because
   its connection to the public internet has gone down.

4) Waiting for some time.

5) Inserting another firewall rule which sends a TCP RST packet for any
   TCP packet going to the server. This simulates the situation were the
   client's public internet connection comes back, but with a different
   IP address assigned to it by the ISP, so that the server will reply
   with a RST because it does not recognize the client anymore.

6) All the way through, monitoring the traffic using tcpdump, making
   sure that there is no traffic generated by the client itself,
   because this would cause keep-alives not to be sent. Also, making
   sure that the kernel did send keep-alives after 20 seconds. On Linux,
   these seem to show up as a succession of a couple of ACKs for the
   same segment sent from the client to the server. At least that is
   what I saw and it is sufficiently different from the other traffic
   to signify that keep-alives were in fact being used.

Without the patch, svn just sat there waiting seemingly forever,
even after the RST firewall rule was inserted.

With the patch, svn threw an error after the quick succession of
ACKS appearing in the tcpdump output:

subversion/svn/checkout-cmd.c:168: (apr_err=104)
subversion/libsvn_client/checkout.c:224: (apr_err=104)
subversion/libsvn_client/update.c:275: (apr_err=104)
subversion/libsvn_ra_svn/client.c:281: (apr_err=104)
subversion/libsvn_ra_svn/editorp.c:877: (apr_err=104)
subversion/libsvn_ra_svn/marshal.c:793: (apr_err=104)
subversion/libsvn_ra_svn/marshal.c:644: (apr_err=104)
subversion/libsvn_ra_svn/marshal.c:644: (apr_err=104)
subversion/libsvn_ra_svn/marshal.c:644: (apr_err=104)
subversion/libsvn_ra_svn/marshal.c:610: (apr_err=104)
subversion/libsvn_ra_svn/marshal.c:297: (apr_err=104)
subversion/libsvn_ra_svn/marshal.c:287: (apr_err=104)
subversion/libsvn_ra_svn/marshal.c:262: (apr_err=104)
subversion/libsvn_ra_svn/streams.c:152: (apr_err=104)
svn: Can't read from connection: Connection reset by peer

The error originates at the call to apr_socket_recv().

Do you agree that this patch is OK and addresses issue #3347 correctly?


Fix issue #3347, "svn never times out when (public) IP changes"

* subversion/libsvn_ra_svn/client.c
  (make_connection): After creating the socket, try to enable TCP
   keep-alives on it so that we have a chance of detecting connection
   problems to the server while we are blocking indefinitely reading
   from the socket.

Suggested by: ghudson

Index: subversion/libsvn_ra_svn/client.c
--- subversion/libsvn_ra_svn/client.c (revision 37867)
+++ subversion/libsvn_ra_svn/client.c (working copy)
@@ -161,6 +161,21 @@ static svn_error_t *make_connection(const char *ho
     return svn_error_wrap_apr(status, _("Can't connect to host '%s'"),
+ /* Enable TCP keep-alives on the socket so we time out when
+ * the connection breaks due to network-layer problems.
+ * If the peer has dropped the connection due to a network partition
+ * or a crash, or if the peer no longer considers the connection
+ * valid because we are behind a NAT and our public IP has changed,
+ * it will respond to the keep-alive probe with a RST instead of an
+ * acknowledgment segment, which will cause svn to abort the session
+ * even while it is currently blocked waiting for data from the peer.
+ * See issue #3347. */
+ status = apr_socket_opt_set(*sock, APR_SO_KEEPALIVE, 1);
+ if (status)
+ {
+ /* It's not a fatal error if we cannot enable keep-alives. */
+ }
   return SVN_NO_ERROR;
Received on 2009-05-28 23:15:24 CEST

This is an archived mail posted to the Subversion Dev mailing list.