
Re: Expected speed of commit over HTTP?

From: Paul Hammant <paul_at_hammant.org>
Date: Fri, 7 Jul 2017 19:34:28 -0400

Great insights, Stefan.

So 'cp' with that sync, timed as you outline, is averaging 3x slower than
the cp as I had it before.

Thus, that same cp with sync tacked on is still 3x faster than the curl PUT
over HTTP.

That puts it in the territory where any speedup in the handoff between
Apache modules would be $100K of programming for a 15% gain - not worth
it. Not worth it because mine is an esoteric use case, and (say) Goldman
don't have Svn in a high-frequency trading pipeline.

Case closed?

Oh, Philip, *yes you were right* - well done for spotting my *cough*
deliberate copy/paste mistake. What I'm using is in fact:

    dd if=/dev/zero bs=1M count=500 2>/dev/null | openssl enc -rc4-40 -pass pass:weak > six

- Paul

On Fri, Jul 7, 2017 at 2:39 PM, Stefan Fuhrmann
<stefanfuhrmann_at_alice-dsl.de> wrote:

> On 07.07.2017 01:10, Paul Hammant wrote:
>
> With autorevision set to 'on' and curl:
>
> Reference speed for boot drive to USB3 spinning-platter 4TB thing:
>
> paul_at_paul-HiBox:~$ time cp /home/paul/clientDir/seven
> /media/paul/sg4t/sevenb
> real 0m1.539s
>
>
> That isn't exactly accurate - you write to the OS file
> cache and not to disk (yet). Subversion instructs the
> OS to flush the relevant parts of its cache to make sure
> committed data will survive power failure etc.
>
> For a fair comparison, you should do something like:
>
> $ sync
> $ time (cp from/path to/path && sync)
>
> FSFS actually calls sync ~7 times per commit (store
> the revprops, the newest head revision number, ...).
> In 1.10, svnadmin has an option to temporarily disable
> this for things like "svnadmin load".
>
> Maybe, we could turn this into a fsfs.conf option for
> sufficiently safe / replicated environments. Not sure
> how I feel about that.
>
> Create a new 501MB file on Svn Server:
>
>
> paul_at_paul-HiBox:~$ time curl -u paul:myPassword
> http://192.168.1.178/svn/svnRepo1/twoB --upload-file
> /home/paul/clientDir/seven
> <title>201 Created</title>
> real 0m49.442s
>
> *I ran that a couple more times and it was up there at 50s*
>
>
> Data compression via deflate (zip algorithm) is typically
> 20MB/s on a slower CPU. That accounts for most of the
> difference you see versus the "compression off" runs below.
>
> You might try using compression-level=1. It still gives you
> some space savings - particularly in the "stupidly redundant
> data" cases - but is 3 .. 4 times faster than the default.
>
> Needlessly overwrite 501MB file (file is unchanged) on Svn Server:
>
>
> paul_at_paul-HiBox:~$ time curl -u paul:myPassword
> http://192.168.1.178/svn/svnRepo1/twoB --upload-file
> /home/paul/clientDir/seven
> real 0m13.426s
>
>
> This reads the old data, decompresses it, verifies its
> checksum and calculates the checksum of the incoming
> data twice (MD5 and SHA1).
>
> On a slow-ish CPU, you get something like
>
> * 100 MB/s decompress
> * 250 MB/s MD5 checksum verification of old contents
> * 250 MB/s MD5 checksum of new contents
> * 200 MB/s SHA1 checksum of new contents
>
> Roughly 12s for 500MB. A quick CPU will be twice as fast.
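Stefan's estimate can be sanity-checked with quick arithmetic over his ballpark throughput figures (these are the numbers from this mail, not fresh measurements):

```shell
# 500 MB pushed through each stage at the quoted rates:
# decompress (100 MB/s), MD5 verify (250), MD5 new (250), SHA1 new (200)
awk 'BEGIN { size = 500
             total = size/100 + size/250 + size/250 + size/200
             printf "%.1f\n", total }'   # prints 11.5 (seconds)
```

which lines up with the observed 13.4s once disk I/O is added on top.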
>
>
> Change the compression-level=0
>
> paul_at_paul-HiBox:~$ sudo nano /media/paul/sg4t/svnParent/svnRepo1/db/fsfs.conf
>
> Create a new 501MB file on Svn Server:
>
>
> paul_at_paul-HiBox:~$ time curl -u paul:myPassword
> http://192.168.1.178/svn/svnRepo1/twoC --upload-file
> /home/paul/clientDir/seven
> <title>201 Created</title>
> real 0m15.312s
>
> Yay - a modest speed boost!!!
>
>
> That would be about 4..5s for the checksum calculation
> and the remaining time for the disk write (80..90MB/s).
>
>
> Restart Apache - which I didn't do before:
>
>
> paul_at_paul-HiBox:~$ systemctl restart apache2
>
> Create a new 501MB file on Svn Server:
>
>
> paul_at_paul-HiBox:~$ time curl -u paul:myPassword
> http://192.168.1.178/svn/svnRepo1/twoD --upload-file
> /home/paul/clientDir/seven
> <title>201 Created</title>
> real 0m14.925s
>
>
> Conclusion:
>
> With compression-level=5 (default), there's a 1:33 cp to curl-PUT ratio.
> With compression-level=0, there's a 1:10 cp to curl-PUT ratio.
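Those ratios can be reproduced from the wall-clock figures quoted earlier in the thread (1.539s for cp, 49.442s and 15.312s for the two curl PUTs):

```shell
# cp-to-curl-PUT ratio at compression-level=5 (default) and at level 0,
# rounded to the nearest integer
awk 'BEGIN { cp = 1.539
             printf "%d %d\n", 49.442/cp + 0.5, 15.312/cp + 0.5 }'   # prints: 32 10
```

(32 rather than the quoted 33 is just rounding of the same timings.)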
>
>
> If you want to measure pure CPU / protocol overhead,
> put your repository on a RAM disk. Most machines
> should be able to handle a <1GB repo.
>
>
> Are there other alluring settings, such as...
>
> enable-rep-sharing = false
>
> This kicks in only after the new data has been written
> (i.e. all of it has come in) and *then* it is found to be redundant.
> So, this option may save quite a bit of disk space but does
> virtually nothing to reduce the time taken.
>
> enable-dir-deltification = false
>
> That *can* actually help, iff you have deeply nested,
> small directories. In that case, it will reduce I/O during
> commit. The effect on read access can be all over the place.
>
> ... but they didn't yield an improvement.
>
> Thanks for all the replies, gang.
>
>
> If you need to set up a repository for large, binary data,
> use the svn: protocol, tune the repo as previously discussed,
> pick a fast server CPU and you should be able to reach up
> to 200MB/s over a single network connection in 1.9 with
> 'svn import'. 'svn commit' takes about twice as long but
> that could be fixed eventually as well.
>
> -- Stefan^2.
>
>
Received on 2017-07-08 01:34:34 CEST

This is an archived mail posted to the Subversion Dev mailing list.