
Re: [PATCH] fix non interruptable hang in svn client when connecting

From: Malcolm Rowe <malcolm-svn-dev_at_farside.org.uk>
Date: 2005-08-26 01:02:02 CEST

On Thu, Aug 25, 2005 at 04:07:12PM -0500, kfogel@collab.net wrote:
> Phillip Susi <psusi@cfl.rr.com> writes:
> > 3) What the heck is wrong with apr_connect? The call should return
> > almost immediately assuming the remote host is alive but has nothing
> > listening on that port. If the remote host is not reachable then the
> > connect call should return after some reasonable timeout period, which
> > is usually between 15 and 60 seconds. In the former case it should
> > indicate that the connection was refused, and in the latter, that the
> > connection timed out. In both cases, SIGINT/ctrl-c should interrupt
> > and kill the process, and obviously SIGKILL/TerminateProcess() most
> > definitely should.
> Another thing that concerns me is that Ben Collins-Sussman, on his Mac
> notebook (OS X 10.4, Darwin kernel 8.2.0), couldn't reproduce most of
> this bug. He ran
>
> $ svn co http://svn.edgewall.com
>
> It hung, of course. He tried Ctrl-C several times, and it didn't
> respond. But when we finally did "kill -9 PID", that killed it.
> There seem to be no ill effects on the network stack from that.

Strange. On Mac OS 10.4.2 (Darwin 8.2.1) I can reproduce the problem.
Note: the original example was an svn:// URL, not an http:// one. I doubt
it matters, though.

I assume that the server isn't responding at all, even with an ICMP message,
so the OS-default timeout applies (~75 seconds on Darwin, ~3 mins on Linux).

There is still one thing that I would have thought would be a problem.
APR automatically restarts any connect() call that fails with EINTR
(see apr_socket_connect() in apr/network_io/unix/sockets.c), so I would
have expected that sending a SIGINT during connection would actually
result in a _longer_ wait, as APR would restart the connection with a new
timeout (and we can't look at the cancellation flag until after
apr_socket_connect() returns, if at all).

However, on my Linux box, this doesn't seem to happen. I still can't cancel
a pending connect() (APR just restarts it and ignores all future SIGINTs),
but the amount of time spent is exactly the same whether the first
connect() is interrupted or not. Strange.

I'm guessing it would be too hard to change things so that we don't rely on
APR to restart the connect() (and thus could look at the cancellation flag
before we retry)?
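
To make the question concrete, here is a minimal sketch in plain POSIX C. It
is not APR's or Subversion's actual code: cancel_requested,
install_sigint_handler(), connect_retry_blind() and
connect_retry_cancellable() are all hypothetical names. The first loop mimics
the retry-on-EINTR behaviour described above; the second checks a flag set by
a SIGINT handler before retrying.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical cancellation flag, set only from the SIGINT handler. */
static volatile sig_atomic_t cancel_requested = 0;

static void on_sigint(int signo)
{
    (void)signo;
    cancel_requested = 1;
}

/* Install the handler WITHOUT SA_RESTART, so that a blocking connect()
 * fails with EINTR instead of being restarted by the kernel. */
int install_sigint_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGINT, &sa, NULL);
}

/* Retry unconditionally on EINTR: the caller never gets a chance to
 * look at the cancellation flag while the loop is running. */
int connect_retry_blind(int fd, const struct sockaddr *addr, socklen_t len)
{
    int rc;
    do {
        rc = connect(fd, addr, len);
    } while (rc == -1 && errno == EINTR);
    return rc;
}

/* Same loop, but give up if a cancellation was requested in between.
 * Reporting this as ECANCELED is a choice made for this sketch only. */
int connect_retry_cancellable(int fd, const struct sockaddr *addr,
                              socklen_t len)
{
    int rc;
    do {
        rc = connect(fd, addr, len);
        if (rc == -1 && errno == EINTR && cancel_requested) {
            errno = ECANCELED;
            return -1;
        }
    } while (rc == -1 && errno == EINTR);
    return rc;
}
```

One caveat: POSIX specifies that when connect() is interrupted by a caught
signal it fails with EINTR but the connection attempt is not aborted and
continues asynchronously. A portable version would therefore probably need to
wait with poll() or select() and check SO_ERROR rather than simply call
connect() again; the sketch above ignores that.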

> If it really has the effect that I thought it had, then it
> would be good, but so far that effect seems to only exist on Yun
> Zheng's computer. Hmmm.

(If the effect is the system hanging, I get that too. I've not tested r15909.)

Regards,
Malcolm

Received on Fri Aug 26 01:04:14 2005

This is an archived mail posted to the Subversion Dev mailing list.
