On Mon, Nov 12, 2012 at 6:20 PM, Philip Martin
<philip.martin_at_wandisco.com> wrote:
> We at WANdisco had a discussion about serf in 1.8 today; I said I'd
> summarise to the list.
>
> One concern is the impact on server performance, particularly older
> servers, of serf as the only client. Issue
> http://subversion.tigris.org/issues/show_bug.cgi?id=3980 has some
> figures. The CPU load has improved recently, serf is no longer such a
> CPU burden on the server. The bandwidth is still a concern because serf
> responses default to uncompressed data while neon responses default to
> svndiff compression. Using mod_deflate is the proposed solution and
> that does address the bandwidth but at the expense of CPU. Also there
> has been an important fix to mod_deflate that is not yet in Apache 2.2
> which may prevent some people enabling mod_deflate (although I did use
> 2.2 and mod_deflate in my tests).
>
> A caching proxy is another solution to the server load. Caching using a
> proxy like squid does work but also has problems. When authn is in
> operation it appears that caching is either hard or not allowed, squid
> certainly doesn't cache. mod_cache might be better but is probably more
> complicated to set up and manage. Perhaps we need to document a
> recommended configuration?
>
> Another problem for proxy caching is that no released Subversion
> currently sends the Vary: header that is required for caching to work
> reliably. I can provoke this error by using squid and Subversion
> 1.6.19:
>
> $ svn up wc1 --config-option servers:global:http-proxy-host=::1 --config-option servers:global:http-proxy-port=3128
> Updating 'wc1':
> ../src/subversion/svn/update-cmd.c:168: (apr_err=200003)
> ../src/subversion/libsvn_client/update.c:639: (apr_err=200003)
> ../src/subversion/libsvn_client/update.c:579: (apr_err=200003)
> ../src/subversion/libsvn_client/update.c:440: (apr_err=200003)
> ../src/subversion/libsvn_wc/adm_crawler.c:858: (apr_err=200003)
> ../src/subversion/libsvn_ra_serf/update.c:2529: (apr_err=200003)
> ../src/subversion/libsvn_ra_serf/util.c:2057: (apr_err=200003)
> ../src/subversion/libsvn_ra_serf/util.c:2038: (apr_err=200003)
> ../src/subversion/libsvn_subr/stream.c:162: (apr_err=200003)
> ../src/subversion/libsvn_delta/svndiff.c:877: (apr_err=200003)
> ../src/subversion/libsvn_delta/text_delta.c:823: (apr_err=200003)
> svn: E200003: Delta source ended unexpectedly
>
> I wonder if serf could detect the SVN_ERR_INCOMPLETE_DATA error and
> check if Vary: header was received? It could then suggest a possible
> reason for the error? Perhaps it could even retry with a no-cache
> header? Or should we provide a way for users to cause serf to always
> send no-cache? I wonder if mod_headers could be used to add the missing
> Vary headers? I'm guessing that the absence of the Vary header would
> also prevent the use of mod_cache.
>
> Another concern is the increased server logging due to the large
> increase in the number of requests. A 1.8 server does better than older
> servers, about 50% fewer requests on checkout, but there is still a big
> increase over neon. No solution other than "it happens".
>
> The memory leak with older servers, issue
> http://subversion.tigris.org/issues/show_bug.cgi?id=4194 will be a big
> problem if it is not fixed. CMike has worked on this recently and I
> think the problem is understood but the solution is not yet known.
>
> Serf is much better than when 1.7 was branched but is it suitable as the
> only client in 1.8?

When I discussed Serf with our operations team a few months ago, they
raised an additional concern that is not included here: the increase
in connections to the HTTP server. This might be another item we want
to document. Their concerns were about tuning, and the impact the
additional connections would have on the years of tuning knowledge
they have amassed. In their case, the concern is the impact on the
authentication and authorization framework. Ours involves an app
server and database connections, so the increase in connections needs
to be configured through all those layers. I would guess that things
like LDAP authentication could have similar issues, though I would
assume the caching in mod_ldap should prevent the additional
connections from hitting the LDAP server.

We did some testing in our lab, and the KeepAlive settings help a lot
here. Without KeepAlive, every Serf request on every connection
obviously needed to be re-authenticated. With KeepAlive on and the
connection limit set high enough, re-authentication was reduced to
just the 4 connections that Serf opens to the server. So in our case,
the main work was making sure this was in place on our server, and
then doing some more testing to figure out the increases we needed to
make in our backends.
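
For illustration, the KeepAlive tuning described above maps onto
standard httpd core directives roughly as follows. This is only a
sketch: the values are assumptions chosen for the example, not the
settings used in the testing described here, and the message does not
say which "connection limit" was raised (MaxKeepAliveRequests is one
candidate).

```apache
# Illustrative values only; assumptions, not measured recommendations.

# Reuse connections so authentication is not repeated for every request.
KeepAlive On

# Allow many requests per persistent connection (0 means unlimited).
MaxKeepAliveRequests 1000

# Seconds to wait for the next request before closing an idle connection.
KeepAliveTimeout 15
```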

They were not concerned about the logs impacting the systems, in the
sense that they have log rotation procedures in place and already deal
with large log files. The main concern is the increase in long-term
storage requirements for the logs, and the costs that will come with
that. I expect I will hear from them in the future when they start to
see just how big the log file size increase is going to be.

I suppose a possible solution would be an Apache directive on the
server that caused it to send only Neon-style REPORT responses, with
the Serf client handling them. That would allow server administrators
to tune the server the way they want, with clients respecting those
options.

--
Thanks
Mark Phippard
http://markphip.blogspot.com/
Received on 2012-11-13 14:59:03 CET