
RE: Re: Re: svn+ssh leaves sshd and svnserve processes running

From: Bob Denny <rdenny_at_dc3.com>
Date: Mon, 12 Oct 2009 18:16:15 -0700 (PDT)

OK, I've done a whole bunch more investigating, including spending the day yesterday with the sources of TortoisePLink from TortoiseCVS (which has the source openly available, thank goodness). That version is marked 1.9.14.2, while the one that comes with TortoiseSVN is marked 0.59.0.0. I built it in VS2008 and started in with the debugger to see what was going on. Again, this is svn+ssh to a remote 1.6.5 subversion server with OpenSSH 4.3p2.

Short answer: TortoisePLink is exiting before it has a chance to properly close the TCP connection to the remote SSH/svnserve, leaving ghost copies of these processes running 'forever' on the remote server. They pile up as Tortoise activity progresses. Why is still a mystery; it doesn't _appear_ to be a logic flaw in TortoisePLink!

As I observed in a previous message, the problem is limited to my XP Pro 64-bit quad-core 2GHz system. I rigged up the command line SVN for an 'ls' command, and added a popup message box at the start of PLink to allow me to attach the debugger after svn.exe started TortoisePLink.exe. After figuring out the threading scheme for managing back-end and client I/O, I discovered that by stopping the PLink program in the debugger near the end of its cycle, the protocol completed and the remote sshd and svnserve processes disappeared correctly. Specifically, I set a breakpoint when the response to the get-dir command (the ls listing) came back from the remote svn server.

What I have determined is that the PLink program "just exits" right at the end, presumably when its parent, svn.exe, closes its handles to PLink's stdin/stdout and exits. Why?? I put breakpoints at every call to exit() in PLink, and they never got hit. Single stepping, toward the very end of the cycle, the PLink program just seems to exit for no reason! In any case, this early exit on my fast system comes so fast that PLink doesn't get a chance to cleanly close the TCP connection to the remote SSH daemon and thus sshd and its child svnserve get left out there swinging in the breeze.
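To make "cleanly close the TCP connection" concrete, here is a minimal sketch of the shutdown order a relay like PLink would be expected to follow once its input hits EOF. It is not PLink's code: relay_then_close, fake_server and the "EOF" goodbye message are hypothetical names made up for illustration, and a local socket pair stands in for the connection to sshd.

```python
import io
import socket
import threading

def relay_then_close(source, sock):
    """Copy source to sock until EOF, then close the connection in order."""
    while True:
        chunk = source.read(4096)
        if not chunk:              # EOF: the parent closed its end of the pipe
            break
        sock.sendall(chunk)
    sock.sendall(b"EOF\n")         # hypothetical stand-in for the protocol's goodbye
    sock.shutdown(socket.SHUT_WR)  # tell the peer that no more data is coming
    while sock.recv(4096):         # wait until the peer has closed its side too
        pass
    sock.close()

def fake_server(sock, received):
    """Hypothetical peer: collect everything until the client shuts down."""
    while True:
        data = sock.recv(4096)
        if not data:               # client finished sending, so clean up and leave
            break
        received.append(data)
    sock.close()

client, server = socket.socketpair()
received = []
worker = threading.Thread(target=fake_server, args=(server, received))
worker.start()
relay_then_close(io.BytesIO(b"( get-dir )"), client)
worker.join()
print(b"".join(received))
```

If the process dies before the goodbye and the shutdown call are reached, the peer in this sketch never sees the goodbye message, which is roughly the situation the remote sshd and svnserve seem to be left in.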

I'll try to dig into this more in the future, but at present all I can say is that TortoisePLink is involved. It's not exiting through its own logic. For reference, here's the tail of a log from my notebook, a slow 32-bit single-core system:

- - - - - - - -
Outgoing packet type 94 / 0x5e (SSH2_MSG_CHANNEL_DATA)
  00000000 00 00 00 00 00 00 00 2e 28 20 67 65 74 2d 64 69 ........( get-di
  00000010 72 20 28 20 30 3a 20 28 20 34 39 20 29 20 66 61 r ( 0: ( 49 ) fa
  00000020 6c 73 65 20 74 72 75 65 20 28 20 6b 69 6e 64 20 lse true ( kind
  00000030 29 20 29 20 29 20 ) ) )
Incoming packet type 94 / 0x5e (SSH2_MSG_CHANNEL_DATA)
  00000000 00 00 01 00 00 00 02 96 28 20 73 75 63 63 65 73 ........( succes
  00000010 73 20 28 20 28 20 29 20 30 3a 20 29 20 29 20 28 s ( ( ) 0: ) ) (
  00000020 20 73 75 63 63 65 73 73 20 28 20 34 39 20 28 20 success ( 49 (
  00000030 29 20 28 20 28 20 36 3a 72 65 67 74 6c 62 20 64 ) ( ( 6:regtlb d
  00000040 69 72 20 30 20 66 61 6c 73 65 20 30 20 28 20 32 ir 0 false 0 ( 2
  00000050 37 3a 31 39 37 30 2d 30 31 2d 30 31 54 30 30 3a 7:1970-01-01T00:
  00000060 30 30 3a 30 30 2e 30 30 30 30 30 30 5a 20 29 20 00:00.000000Z )
  00000070 28 20 29 20 29 20 28 20 37 3a 63 66 69 74 73 69 ( ) ) ( 7:cfitsi
  00000080 6f 20 64 69 72 20 30 20 66 61 6c 73 65 20 30 20 o dir 0 false 0
  00000090 28 20 32 37 3a 31 39 37 30 2d 30 31 2d 30 31 54 ( 27:1970-01-01T
  000000a0 30 30 3a 30 30 3a 30 30 2e 30 30 30 30 30 30 5a 00:00:00.000000Z
  000000b0 20 29 20 28 20 29 20 29 20 28 20 31 31 3a 74 73 ) ( ) ) ( 11:ts
  000000c0 76 6e 2d 67 65 6d 69 6e 69 20 64 69 72 20 30 20 vn-gemini dir 0
  000000d0 66 61 6c 73 65 20 30 20 28 20 32 37 3a 31 39 37 false 0 ( 27:197
  000000e0 30 2d 30 31 2d 30 31 54 30 30 3a 30 30 3a 30 30 0-01-01T00:00:00
  000000f0 2e 30 30 30 30 30 30 5a 20 29 20 28 20 29 20 29 .000000Z ) ( ) )
  00000100 20 28 20 38 3a 72 65 6d 6f 74 69 6e 67 20 64 69 ( 8:remoting di
  00000110 72 20 30 20 66 61 6c 73 65 20 30 20 28 20 32 37 r 0 false 0 ( 27
  00000120 3a 31 39 37 30 2d 30 31 2d 30 31 54 30 30 3a 30 :1970-01-01T00:0
  00000130 30 3a 30 30 2e 30 30 30 30 30 30 5a 20 29 20 28 0:00.000000Z ) (
  00000140 20 29 20 29 20 28 20 36 3a 64 63 33 72 65 67 20 ) ) ( 6:dc3reg
  00000150 64 69 72 20 30 20 66 61 6c 73 65 20 30 20 28 20 dir 0 false 0 (
  00000160 32 37 3a 31 39 37 30 2d 30 31 2d 30 31 54 30 30 27:1970-01-01T00
  00000170 3a 30 30 3a 30 30 2e 30 30 30 30 30 30 5a 20 29 :00:00.000000Z )
  00000180 20 28 20 29 20 29 20 28 20 31 30 3a 74 68 69 6e ( ) ) ( 10:thin
  00000190 73 63 72 69 70 74 20 64 69 72 20 30 20 66 61 6c script dir 0 fal
  000001a0 73 65 20 30 20 28 20 32 37 3a 31 39 37 30 2d 30 se 0 ( 27:1970-0
  000001b0 31 2d 30 31 54 30 30 3a 30 30 3a 30 30 2e 30 30 1-01T00:00:00.00
  000001c0 30 30 30 30 5a 20 29 20 28 20 29 20 29 20 28 20 0000Z ) ( ) ) (
  000001d0 38 3a 73 6d 74 70 63 74 72 6c 20 64 69 72 20 30 8:smtpctrl dir 0
  000001e0 20 66 61 6c 73 65 20 30 20 28 20 32 37 3a 31 39 false 0 ( 27:19
  000001f0 37 30 2d 30 31 2d 30 31 54 30 30 3a 30 30 3a 30 70-01-01T00:00:0
  00000200 30 2e 30 30 30 30 30 30 5a 20 29 20 28 20 29 20 0.000000Z ) ( )
  00000210 29 20 28 20 37 3a 69 6e 73 74 72 65 67 20 64 69 ) ( 7:instreg di
  00000220 72 20 30 20 66 61 6c 73 65 20 30 20 28 20 32 37 r 0 false 0 ( 27
  00000230 3a 31 39 37 30 2d 30 31 2d 30 31 54 30 30 3a 30 :1970-01-01T00:0
  00000240 30 3a 30 30 2e 30 30 30 30 30 30 5a 20 29 20 28 0:00.000000Z ) (
  00000250 20 29 20 29 20 28 20 37 3a 63 6c 61 70 61 63 6b ) ) ( 7:clapack
  00000260 20 64 69 72 20 30 20 66 61 6c 73 65 20 30 20 28 dir 0 false 0 (
  00000270 20 32 37 3a 31 39 37 30 2d 30 31 2d 30 31 54 30 27:1970-01-01T0
  00000280 30 3a 30 30 3a 30 30 2e 30 30 30 30 30 30 5a 20 0:00:00.000000Z
  00000290 29 20 28 20 29 20 29 20 29 20 29 20 29 20 ) ( ) ) ) ) )
Outgoing packet type 93 / 0x5d (SSH2_MSG_CHANNEL_WINDOW_ADJUST)
  00000000 00 00 00 00 00 00 02 00 ........
Outgoing packet type 93 / 0x5d (SSH2_MSG_CHANNEL_WINDOW_ADJUST)
  00000000 00 00 00 00 00 00 00 96 ........
Outgoing packet type 96 / 0x60 (SSH2_MSG_CHANNEL_EOF)
  00000000 00 00 00 00 ....
- - - - - - - -

and on the 64-bit quad-core the log ends with:

- - - - - - - - -
  [...]
  00000260 20 64 69 72 20 30 20 66 61 6c 73 65 20 30 20 28 dir 0 false 0 (
  00000270 20 32 37 3a 31 39 37 30 2d 30 31 2d 30 31 54 30 27:1970-01-01T0
  00000280 30 3a 30 30 3a 30 30 2e 30 30 30 30 30 30 5a 20 0:00:00.000000Z
  00000290 29 20 28 20 29 20 29 20 29 20 29 20 29 20 ) ( ) ) ) ) )
Outgoing packet type 93 / 0x5d (SSH2_MSG_CHANNEL_WINDOW_ADJUST)
  00000000 00 00 00 00 00 00 02 00 ........
- - - - - - - -

The second window adjust and the EOF are not sent/logged, and the remote processes are left running. Process Monitor (Sysinternals) shows all of the threads "just exiting" and a PLink exit status of 9. What is that???
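For reading the packet tails, a short sketch that decodes the window-adjust payloads may help. It assumes the logged bytes are the packet body after the message type byte, which is how the dumps appear to be laid out; the field layout (uint32 recipient channel, uint32 bytes to add) follows the SSH connection protocol, RFC 4254. The function name parse_window_adjust is made up for this example.

```python
import struct

def parse_window_adjust(payload: bytes):
    """Return (recipient_channel, bytes_to_add) from an 8-byte payload."""
    channel, bytes_to_add = struct.unpack(">II", payload)
    return channel, bytes_to_add

# Payloads copied from the 32-bit notebook log.
first = parse_window_adjust(bytes.fromhex("00 00 00 00 00 00 02 00"))
second = parse_window_adjust(bytes.fromhex("00 00 00 00 00 00 00 96"))

# First 8 bytes of the incoming CHANNEL_DATA packet:
# uint32 recipient channel, then uint32 length of the data string.
data_channel, data_length = struct.unpack(
    ">II", bytes.fromhex("00 00 01 00 00 00 02 96"))

print(first)        # (0, 512)
print(second)       # (0, 150)
print(data_length)  # 662
print(first[1] + second[1] == data_length)  # True
```

Under that assumption, the two adjusts together hand back exactly the 662 bytes of the get-dir response. On the quad-core only the first one (512 bytes) shows up in the log before the process goes away.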

I suspect that PLink is dying as a result of its parent svn.exe exiting. But that's not supposed to happen, at least as far as I know! Maybe it's some MS "security feature" that a child process cannot run to completion after its parent exits? I'm gonna feel stupid if this is the case!

  -- Bob

------------------------------------------------------
http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=4061&dsMessageId=2406873

Received on 2009-10-13 23:14:34 CEST

This is an archived mail posted to the TortoiseSVN Users mailing list.