[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] fix non interruptable hang in svn client when connecting

From: Yun Zheng Hu <yunzheng.hu_at_gmail.com>
Date: 2005-08-25 17:39:01 CEST

Sorry for the brackets and not using an unified diff, I think I
misinterpreted the patch submission guidelines a bit :) wont happen
again.

As for the timeout, its seems OS dependant, its even shorter (~1
minute) on OS X:

$ time svn co svn://svn.edgewall.com
svn: Can't connect to host 'svn.edgewall.com': Operation timed out

real 1m14.639s
user 0m0.004s
sys 0m0.011s

$ time svn co svn://svn.edgewall.com
svn: Can't connect to host 'svn.edgewall.com': Operation timed out

real 1m14.908s
user 0m0.004s
sys 0m0.010s

But here's how to reproduce the hanging process bug:

Do the same command, but interrupt it with a ctrl-c (one or multiple
times, doesnt matter):
$ time svn co svn://svn.edgewall.com
^C

This will result in a hanging process, at least on OS X.
Its also not killable with a 'kill -9' after you interrupt it, because
the signal_handler is setup to ignore future interrupts (SIG_IGN).
Ofcourse you could not interrupt it and let it timeout properly, but
usually the user knows he made a typo of some kind and want to kill
the process asap.

The bug might also be related to apr_signal, but havent really tested yet.

By setting a timeout on the socket the process will become killable
again, the timeout can be 10 seconds or 4 minutes or whatever, as long
as we have a timeout.
If we dont set a timeout, the apr_connect function will use a blocking
socket connect but does not timeout if the user sent an interrupt (the
bug).
If we do set a timeout it will be a nonblocking connect, and resort to
apr_poll for proper timeout (see apr/network_io/unix/sockets.c).

On 24 Aug 2005 22:24:03 -0500, kfogel@collab.net <kfogel@collab.net> wrote:
> Yun Zheng Hu <yunzheng.hu@gmail.com> writes:
> > [[[
> > When you connect to a non existing repository, but the host exists,
> > the apr_socket_connect will loop forever somehow, and is NOT
> > interruptable.
> > On OS X this will result in a inresponsive system, which is very bad.
> >
> > How to reproduce:
> > svn co svn://svn.edgewall.com
> >
> > There is no svn server running on that port, so try a CTRL-C to
> > cancel, but its not possible.. it will loop forever. process is not
> > killable on OS X either.
> >
> > I found out that somebody else was having this problem too, but could
> > not find a bug report and neither the fix. The user that was having
> > this same bug can be read on:
> >
> > http://svn.haxx.se/users/archive-2005-07/1027.shtml
> >
> > The following patch will set a timeout of 10 seconds on apr_socket_connect.
> > So if it cant connect it will timeout safely, 10 seconds should be
> > more than enough I think.
> > The patch will also include the port number in the error message, it
> > might be useful or not..
> > ]]]
>
> Thanks for the patch. I have some technical comments below. Also,
> there's no need to use the "[[[ ... ]]]" convention unless you are
> actually putting a log message inside the braces. (You put the main
> body of your message there instead.)
>
> > The patch is very simple, open up:
> > subversion/libsvn_ra_svn/client.c
> >
> > search for: "apr_socket_connect", and change the code to:
> >
> > status = apr_setsocketopt(*sock, APR_SO_TIMEOUT, 10 * APR_USEC_PER_SEC);
> > if (status)
> > return svn_error_wrap_apr(status, _("Can't set timeout"));
> >
> > status = apr_socket_connect(*sock, sa);
> > if (status)
> > return svn_error_wrap_apr(status, _("Can't connect to host '%s' on
> > port %u"),
> > hostname, port);
> >
>
> I think we shouldn't mix the port-number-in-error-message change with
> the timeout change, as they are two unrelated things.
>
> Also, do you know how to generate a real patch, that is, something
> that can be applied using the 'patch' program? Just make your
> modifications and run 'svn diff'. The result will be a patch, in the
> format expected by all open source projects.
>
> Now the technical notes:
>
> apr_setsocketopt() is deprecated, and recommends the use of
> apr_socket_set_opt() instead. However, if you look at the
> documentation for apr_socket_set_opt(), the APR_SO_TIMEOUT option is
> not listed as one of the valid options. Instead, one should use
> apr_socket_set_timeout()! :-)
>
> So that's what I've done in the patch below.
>
> However, I'm not sure we should apply this patch; at least, I'd like
> other developer's thoughts. There already seems to be a default
> timeout, it's just long. It's around 3 minutes, according to my
> tests:
>
> $ time svn co svn://svn.edgewall.com
> subversion/libsvn_ra_svn/client.c:141: (apr_err=110)
> svn: Can't connect to host 'svn.edgewall.com': Connection timed out
>
> real 3m9.548s
> user 0m0.020s
> sys 0m0.008s
>
> $ time svn co svn://svn.edgewall.com
> subversion/libsvn_ra_svn/client.c:141: (apr_err=110)
> svn: Can't connect to host 'svn.edgewall.com': Connection timed out
>
> real 3m9.056s
> user 0m0.022s
> sys 0m0.005s
>
> $
>
> Now, if we lower that timeout, we're rejecting the default chosen by
> APR, without doing anything to solve the root of the problem, which is
> the lack of interruptability. On the other hand, the lack of
> interruptability is sort of our fault, since we handle signals
> ourselves. We have to run our check-cancellation function to actually
> do anything with a signal, and that means we lose whenever we go into
> an APR function like apr_socket_connect() :-(.
>
> I'm not sure what the right answer is here. Should we lower the
> socket timeout to 10 seconds, since we have made apr_socket_connect()
> uninterruptable? Anyone have any thoughts?
>
> Here's the new patch, anyway:
>
> [[[
> Fix a bug whereby connecting to a non-existent repository can cause
> apr_socket_connect() to hang non-interruptably. To reproduce, try
> 'svn co svn://svn.edgewall.com'.
>
> Patch by: Yun Zheng Hu <yunzheng.hu@gmail.com>
> (Tweaked by me to use apr_socket_set_timeout instead of apr_setsocketopt.)
>
> * subversion/libsvn_ra_svn/client.c
> (make_connection): Set socket timeout to 10 seconds.
>
> Yun Zheng Hu's original message:
>
> http://subversion.tigris.org/servlets/ReadMsg?list=dev&msgNo=104587
> From: Yun Zheng Hu <yunzheng.hu@gmail.com>
> To: dev@subversion.tigris.org
> Subject: [PATCH] fix non interruptable hang in svn client when connecting
> Date: Thu, 25 Aug 2005 00:40:07 +0200
> Message-ID: <df84014e05082415405cdd9b13@mail.gmail.com>
> ]]]
>
> Index: subversion/libsvn_ra_svn/client.c
> ===================================================================
> --- subversion/libsvn_ra_svn/client.c (revision 15898)
> +++ subversion/libsvn_ra_svn/client.c (working copy)
> @@ -136,6 +136,10 @@
> if (status)
> return svn_error_wrap_apr(status, _("Can't create socket"));
>
> + status = apr_socket_timeout_set(*sock, 10 * APR_USEC_PER_SEC);
> + if (status)
> + return svn_error_wrap_apr(status, _("Can't set timeout"));
> +
> status = apr_socket_connect(*sock, sa);
> if (status)
> return svn_error_wrap_apr(status, _("Can't connect to host '%s'"),
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Thu Aug 25 17:40:00 2005

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.