serf in 1.8

From: Philip Martin <philip.martin_at_wandisco.com>
Date: Mon, 12 Nov 2012 23:20:44 +0000

We at WANdisco had a discussion about serf in 1.8 today; I said I'd
summarise it for the list.

One concern is the impact on server performance, particularly older
servers, of serf as the only client. Issue
http://subversion.tigris.org/issues/show_bug.cgi?id=3980 has some
figures. The CPU load has improved recently: serf is no longer such a
CPU burden on the server. Bandwidth is still a concern because serf
responses default to uncompressed data while neon responses default to
svndiff compression. Using mod_deflate is the proposed solution, and
that does address the bandwidth, but at the expense of CPU. Also, there
has been an important fix to mod_deflate that is not yet in Apache 2.2,
which may prevent some people from enabling mod_deflate (although I did
use 2.2 and mod_deflate in my tests).
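
As a rough illustration of the bandwidth-versus-CPU trade-off, here is a
minimal Python sketch. It assumes that zlib's deflate is a fair stand-in
for what mod_deflate does to a response body. The payload is synthetic
and deflate_cost is a name made up for this example, so the numbers say
nothing about a real repository.

```python
import time
import zlib


def deflate_cost(payload: bytes, level: int = 6) -> dict:
    """Compress payload with zlib and report sizes and CPU time.

    Hypothetical helper for illustration; not part of Subversion or serf.
    """
    start = time.process_time()
    compressed = zlib.compress(payload, level)
    cpu_seconds = time.process_time() - start
    return {
        "raw_bytes": len(payload),
        "compressed_bytes": len(compressed),
        "cpu_seconds": cpu_seconds,
        # Sanity check that decompression restores the original bytes.
        "round_trip_ok": zlib.decompress(compressed) == payload,
    }


# Synthetic, highly repetitive content; real file data compresses less well.
sample = b"int main(void) { return 0; }\n" * 10000
report = deflate_cost(sample)
print(report)
```

Fewer bytes go over the wire, but every response costs the server some
CPU time, which is the trade-off described above.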

A caching proxy is another solution to the server load. Caching using a
proxy like squid does work but also has problems. When authn is in
operation it appears that caching is either hard or not allowed; squid
certainly doesn't cache. mod_cache might be better but is probably more
complicated to set up and manage. Perhaps we need to document a
recommended configuration?

Another problem for proxy caching is that no released Subversion
currently sends the Vary: header that is required for caching to work
reliably. I can provoke an error by using squid as the proxy for
Subversion:

$ svn up wc1 --config-option servers:global:http-proxy-host=::1 --config-option servers:global:http-proxy-port=3128
Updating 'wc1':
../src/subversion/svn/update-cmd.c:168: (apr_err=200003)
../src/subversion/libsvn_client/update.c:639: (apr_err=200003)
../src/subversion/libsvn_client/update.c:579: (apr_err=200003)
../src/subversion/libsvn_client/update.c:440: (apr_err=200003)
../src/subversion/libsvn_wc/adm_crawler.c:858: (apr_err=200003)
../src/subversion/libsvn_ra_serf/update.c:2529: (apr_err=200003)
../src/subversion/libsvn_ra_serf/util.c:2057: (apr_err=200003)
../src/subversion/libsvn_ra_serf/util.c:2038: (apr_err=200003)
../src/subversion/libsvn_subr/stream.c:162: (apr_err=200003)
../src/subversion/libsvn_delta/svndiff.c:877: (apr_err=200003)
../src/subversion/libsvn_delta/text_delta.c:823: (apr_err=200003)
svn: E200003: Delta source ended unexpectedly

I wonder if serf could detect the SVN_ERR_INCOMPLETE_DATA error and
check whether a Vary: header was received? It could then suggest a
possible reason for the error. Perhaps it could even retry with a no-cache
header? Or should we provide a way for users to cause serf to always
send no-cache? I wonder if mod_headers could be used to add the missing
Vary headers? I'm guessing that the absence of the Vary header would
also prevent the use of mod_cache.
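
To make the detection idea concrete, here is a minimal Python sketch of
the check. It is not serf or Subversion code: explain_incomplete_data is
a hypothetical helper, and treating Via or X-Cache as a sign that a
proxy handled the response is an assumption of mine.

```python
from typing import Mapping, Optional


def explain_incomplete_data(headers: Mapping[str, str]) -> Optional[str]:
    """Return a hint if a truncated delta might be due to a cache.

    Hypothetical helper: `headers` are the HTTP response headers that
    came with the failing response. Not a serf or Subversion API.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    names = {name.lower() for name in headers}
    if "vary" in names:
        # The server told caches what the response depends on, so a
        # confused proxy is a less likely explanation.
        return None
    hint = ("response had no Vary header; a caching proxy may have "
            "served a response meant for a different client")
    # Assumption: these headers indicate that a proxy was involved.
    if "via" in names or "x-cache" in names:
        hint += " (response appears to have passed through a proxy)"
    return hint
```

A client could call this when it sees E200003 and append the returned
hint, if any, to the error message.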

Another concern is the increased server logging due to the large
increase in the number of requests. A 1.8 server does better than older
servers, with about 50% fewer requests on checkout, but there is still a
big increase over neon. There is no solution other than "it happens".

The memory leak with older servers, issue
http://subversion.tigris.org/issues/show_bug.cgi?id=4194, will be a big
problem if it is not fixed. CMike has worked on this recently and I
think the problem is understood, but the solution is not yet known.

Serf is much better than when 1.7 was branched, but is it suitable as
the only client in 1.8?

