
Re: svnserve under Linux inetd hangs, burning CPU cycles, under too low TCP-sendbuffer.

From: Dr. Andreas Krüger <andreas.krueger_at_dv-ratio.com>
Date: Tue, 02 Mar 2010 14:03:37 +0100

Hello, Philip,

Management summary: it does look like svnserve is innocent and
OpenVZ's write call is to blame after all!

Firstly, I have now been able to move this entire affair off my
internet "production" server.

The bug is reproducible locally, on my "home" machine, with a plain
Debian sid openvz guest (hosted by a plain Debian lenny host). Even
better: svn client and svn server on that same openvz guest machine
reproduce this bug! That is, a simple "svn co svn://localhost bug"
does the trick!

I also tried the other direction: a big "svn ci" to svn://localhost.
This also hangs, but this time on the svn (client) side.

> Somebody (you probably) needs to debug it.

Ooouch... Have not touched gdb for some 10 years now...

But if debug we must, debug we must, I thought. So I dutifully
grabbed the code from http://svn.apache.org/repos/asf/subversion/trunk,
compiled it, and could reproduce the bug with that build as well.

Just short of digging up the gdb documentation, I did one final
"strace -f -ff -o ...".

Interestingly, while svnserve was burning CPU cycles like crazy, the
trace log of that particular process did not grow!

The last syscall that strace had logged was a write call for which no
return value had been logged yet. This is what it looked like:

write(4, "7\310 lots of data skipped here"..., 99321

So it seems svnserve is hanging, burning CPU cycles, FROM INSIDE A
WRITE SYSTEM CALL!
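
One way to cross-check this observation without strace is to compare user and kernel CPU time for the process: if the kernel-time counter keeps climbing while no new syscalls are logged, the cycles are being spent inside the kernel. The following is only a sketch, assuming a Linux /proc filesystem; the function names cpu_ticks and read_cpu_ticks are made up for illustration.

```python
def cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # Field 2 (the command name) is wrapped in parentheses and may contain
    # spaces, so only split what follows the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]), int(rest[12])


def read_cpu_ticks(pid):
    """Read the counters for a live process (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return cpu_ticks(f.read())
```

Calling read_cpu_ticks twice a few seconds apart on the busy process and seeing only the second number grow would be consistent with a spin inside the write call.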

That write call was interrupted when I finally killed the server. Now
there was a return value, indicating a partial write:

write(4, "7\310 lots of data skipped here"..., 99321) = 44701
--- SIGTERM (Terminated) @ 0 (0) ---
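
For context, a return value smaller than the requested length (here 44701 of 99321 bytes) is a legitimate outcome of write on a socket, and callers are expected to loop until everything is sent. A minimal sketch of such a loop follows, in Python rather than C for brevity; send_all is a hypothetical helper, not svnserve's actual code.

```python
import socket


def send_all(sock, data):
    """Send all of data, retrying after partial writes; return the call count."""
    view = memoryview(data)
    calls = 0
    while len(view) > 0:
        sent = sock.send(view)  # may accept fewer bytes than were offered
        view = view[sent:]      # keep only the part not yet sent
        calls += 1
    return calls
```

On a blocking socket each send should return as soon as some buffer space is available, so the loop makes progress as long as the peer keeps reading.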

> As I understand it OpenVZ involves a custom kernel with its own
> network drivers, perhaps the write syscall is doing something
> unexpected.

I now think you are right on target with that one! It does look like
we need to blame openvz or the Linux kernel, after all.

I'll try to cobble together a few-line C server and see whether that
reproduces the bug. I'll let you know whether the result of that
experiment matches what we now think is going on.
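
A rough idea of what such a test server could look like is sketched below. It is written in Python rather than C to keep it short, and everything in it is an assumption made for illustration: the names serve_one and run_once, the 4096-byte send buffer, and the payload content are not taken from the actual experiment.

```python
import socket
import threading


def serve_one(listener, payload, sndbuf):
    """Accept one client, shrink the send buffer, and write the payload."""
    conn, _ = listener.accept()
    with conn:
        # Assumed value: a deliberately small send buffer. The kernel may
        # round the requested size up to its own minimum.
        conn.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
        conn.sendall(payload)  # expected to return once the client has read


def run_once(size=99321, sndbuf=4096):
    """Run server and client on localhost; return the bytes the client got."""
    payload = b"\x37" * size
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        server = threading.Thread(
            target=serve_one, args=(listener, payload, sndbuf))
        server.start()
        received = 0
        with socket.create_connection(("127.0.0.1", port)) as client:
            while True:
                chunk = client.recv(65536)
                if not chunk:  # server closed the connection
                    break
                received += len(chunk)
        server.join()
    return received
```

On an unaffected system run_once() should simply return the payload size; a run that never returns while the process burns CPU would point at the kernel side rather than at svnserve.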

So, for now, it looks like no svn bug after all.

Regards, Andreas

--
Dr. Andreas Krüger, Consultant, DV-RATIO NORDWEST GmbH
andreas.krueger_at_dv-ratio.com
GPG/PGP Fingerprint 8063 4A9B 362D 4220 A546 14C1 EA19 AADC FD44 5EB7

Received on 2010-03-02 14:04:16 CET

This is an archived mail posted to the Subversion Dev mailing list.