My company is currently considering moving to Subversion, and I have
been doing various tests and investigation to consider the feasibility
of this move.
Generally everything has looked pretty rosy (apart from Subversion
graphing, but that's an entirely different discussion).
That is, until I started to consider multi-site repository mirroring.
Let's say (for a start) we have an office in the UK, one in France and
another in India. Let's additionally say that the master server is in
the UK and all the other sites are connected by a hosted WAN system.
Each remote site then has a replicated slave, using Apache and
transparent proxying of write operations back to the master.
The particular test that showed worrying problems was the import of a
directory tree full of Linux header files for multiple
architectures - a pretty common occurrence in the company's line of
work. [Note: though it could be any tree with many files in it]
The directory tree contained 3739 files totalling 29MiB.
An import from the UK to the UK master takes: 38s
An import from France to the France slave takes: 3m48s
An import from France to the UK master takes: 3m46s
An import from India to the UK master takes: 27m57s !!
So, clearly an issue. The counter to this is that commits are relatively
uncommon compared to checkouts, and furthermore that commits of over
3000 files are fairly uncommon. However, I don't think this quite covers
the difficulty - engineers _will_ want to do large checkins on remote sites.
I retried the test using ra_svn and the India import took just _22s_!
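To put those figures on a common footing, here is a rough per-file breakdown. It is only back-of-envelope arithmetic on the timings quoted above; the idea that the cost scales with the number of files (rather than with the 29MiB of data) is an assumption on my part, not something measured directly.

```python
# Timings quoted above, converted to seconds.
FILES = 3739

timings = {
    "UK -> UK master": 38,
    "France -> France slave": 3 * 60 + 48,
    "France -> UK master": 3 * 60 + 46,
    "India -> UK master": 27 * 60 + 57,
    "India, retried with ra_svn": 22,
}

def per_file_ms(total_seconds, files=FILES):
    """Average wall-clock time per imported file, in milliseconds."""
    return 1000.0 * total_seconds / files

for route, seconds in timings.items():
    print(f"{route}: {seconds}s total, {per_file_ms(seconds):.1f} ms/file")
```

The slow India import works out at a little under half a second per file, against a few milliseconds per file for the ra_svn retry, which is at least consistent with the time going on per-file round trips rather than on raw transfer.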
OK, so on the surface ra_svn is clearly the solution...except you can't
do transparent proxying of reads/writes to a local slave with
ra_svn. So you end up doing general read operations from the local
slave, with either ra_svn or ra_neon, and then having to switch to
using ra_svn whenever you do a (large) commit. This will require
either a fair amount of effort from the individual engineer, or a
script to perform the switch for them.
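For illustration, such a wrapper script might look roughly like the sketch below. The two URLs and the function names are hypothetical placeholders, and the whole thing assumes the slave and master report the same repository UUID, since `svn switch --relocate` checks for that. It is a sketch of the idea, not something I have run against a real mirror.

```python
import subprocess

# Hypothetical URLs: substitute your own slave and master locations.
SLAVE_URL = "http://svn-slave.example.com/repos/project"
MASTER_URL = "svn://svn-master.example.com/project"

def commit_commands(message, wc_path="."):
    """Build the three svn invocations: relocate to master, commit, relocate back."""
    return [
        ["svn", "switch", "--relocate", SLAVE_URL, MASTER_URL, wc_path],
        ["svn", "commit", "-m", message, wc_path],
        ["svn", "switch", "--relocate", MASTER_URL, SLAVE_URL, wc_path],
    ]

def commit_via_master(message, wc_path=".", run=subprocess.check_call):
    """Commit over ra_svn to the master, then point the working copy
    back at the local slave even if the commit itself fails."""
    relocate_out, commit, relocate_back = commit_commands(message, wc_path)
    run(relocate_out)
    try:
        run(commit)
    finally:
        run(relocate_back)
```

Usage would be something like `commit_via_master("Import Linux headers")` from the top of the working copy. One caveat: after relocating back, the working copy may briefly be ahead of the slave until the new revision has been replicated, so an immediate update against the slave could fail.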
So my question is: Is there any solution to this? I've just had a read
of the ra_serf README, and _checkout_ parallelisation/pipelining has
been implemented, but _commit_ pipelining has not yet been implemented
and is only on the "Nice to haves" list.
So, is the "nice to have" in ra_serf likely to mean it won't be
implemented any time soon? Clearly commit pipelining is possible,
because ra_svn already does it...
Right, I'll leave it at that for now.
John Beranek To generalise is to be an idiot.
http://redux.org.uk/ -- William Blake
Received on 2009-06-24 16:52:12 CEST