Re: Expected speed of commit over HTTP?

From: Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de>
Date: Fri, 7 Jul 2017 20:39:37 +0200

On 07.07.2017 01:10, Paul Hammant wrote:
> With autoversioning set to 'on' and curl:
>
> Reference speed for boot drive to USB3 spinning platter 4TB thing:
>
> paul_at_paul-HiBox:~$ time cp /home/paul/clientDir/seven
> /media/paul/sg4t/sevenb
> real 0m1.539s
>

That isn't exactly accurate - you write to the OS file
cache and not to disk (yet). Subversion instructs the
OS to flush the relevant parts of its cache to make sure
committed data will survive power failure etc.

For a fair comparison, you should do something like:

$ sync
$ time (cp from/path to/path && sync)
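
If you would rather measure this from a script, the same idea can be
sketched in Python. This is only an illustration, not anything Subversion
ships: the helper name timed_copy and the 16MB scratch file are made up
for the example. It copies a file once without and once with an explicit
fsync, so you can see how much of the "copy time" is really just filling
the OS cache.

```python
import os
import shutil
import tempfile
import time


def timed_copy(src, dst, durable):
    """Copy src to dst and return the elapsed time in seconds.

    With durable=True the destination is fsync'ed before the clock
    stops, so the figure includes pushing the data to the device.
    (Hypothetical helper, for illustration only.)
    """
    start = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, 1024 * 1024)
        if durable:
            fout.flush()              # drain Python's own buffer first
            os.fsync(fout.fileno())   # then ask the OS to flush its cache
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(16 * 1024 * 1024))
        cached = timed_copy(src, os.path.join(tmp, "cached.bin"), False)
        synced = timed_copy(src, os.path.join(tmp, "synced.bin"), True)
        print(f"cache only: {cached:.3f}s, with fsync: {synced:.3f}s")
```

Point the paths at the actual target drive to get numbers comparable to
the cp test; a temporary directory usually lives on the boot drive.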

FSFS actually calls sync ~7 times per commit (store
the revprops, the newest head revision number, ...).
In 1.10, svnadmin has an option to temporarily disable
this for things like "svnadmin load".

Maybe, we could turn this into a fsfs.conf option for
sufficiently safe / replicated environments. Not sure
how I feel about that.

> Create a new 501MB file on Svn Server:
>
>
> paul_at_paul-HiBox:~$ time curl -u paul:myPassword
> http://192.168.1.178/svn/svnRepo1/twoB --upload-file
> /home/paul/clientDir/seven
> <title>201 Created</title>
> real 0m49.442s
>
> /I ran that a couple more times and it was up there at 50s/
>
>
Data compression via deflate (zip algorithm) typically runs
at about 20MB/s on a slower CPU. That accounts for most of
the difference you see compared to the "compression off" case.

You might try using compression-level=1. It still gives you
some space savings - particularly in the "stupidly redundant
data" cases - but is 3 .. 4 times faster than the default.
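
To get a feel for the speed / size trade-off on your own CPU, a rough
Python sketch like the following can help. It uses Python's zlib module
as a stand-in for the deflate step; it is not how FSFS itself is
structured, and the function name and the sample data are assumptions
made up for the example.

```python
import os
import time
import zlib


def deflate_throughput(data, level):
    """Return (MB/s, compressed size in bytes) for one zlib pass."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid divide by zero
    return len(data) / 1e6 / elapsed, len(packed)


if __name__ == "__main__":
    # Half random, half repetitive: a crude stand-in for mixed binary data.
    half = 8 * 1024 * 1024
    sample = os.urandom(half) + b"svn" * (half // 3)
    for level in (1, 5, 9):
        rate, size = deflate_throughput(sample, level)
        print(f"level {level}: {rate:7.1f} MB/s, {size / 1e6:.1f} MB out")
```

The absolute numbers will differ from what the server sees, but the
ratio between level 1 and level 5 should give a reasonable hint.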

> Needlessly overwrite 501MB file (file is unchanged) on Svn Server:
>
>
> paul_at_paul-HiBox:~$ time curl -u paul:myPassword
> http://192.168.1.178/svn/svnRepo1/twoB --upload-file
> /home/paul/clientDir/seven
> real 0m13.426s
>

This reads the old data, decompresses it, verifies its
checksum and calculates the checksum of the incoming
data twice (MD5 and SHA1).

On a slow-ish CPU, you get something like

     * 100 MB/s decompress
     * 250 MB/s MD5 checksum verification
     * 250 MB/s MD5 checksum new contents
     * 200 MB/s SHA1 checksum new contents

Roughly 12s for 500MB. A quick CPU will be twice as fast.
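
The estimate is just the sum of the per-stage times. A small Python
sketch of the arithmetic, assuming the four stages run one after another
on a single core (the dictionary labels are mine, the rates are the ones
listed above):

```python
SIZE_MB = 500

# Throughput figures for a slow-ish CPU, in MB/s, as listed above.
STAGES = {
    "decompress old data": 100,
    "MD5 verify old data": 250,
    "MD5 of new contents": 250,
    "SHA1 of new contents": 200,
}


def stage_seconds(size_mb, stages):
    """Time per stage in seconds, assuming strictly sequential processing."""
    return {name: size_mb / rate for name, rate in stages.items()}


times = stage_seconds(SIZE_MB, STAGES)
total = sum(times.values())

for name, secs in times.items():
    print(f"{name:22s} {secs:4.1f}s")
print(f"{'total':22s} {total:4.1f}s")  # 5.0 + 2.0 + 2.0 + 2.5 = 11.5s
```

That 11.5s is close to the 13.4s measured, with the remainder going to
the network transfer and general protocol overhead.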

>
> Change the compression-level=0
>
> paul_at_paul-HiBox:~$ sudo nano
> /media/paul/sg4t/svnParent/svnRepo1/db/fsfs.conf
>
> Create a new 501MB file on Svn Server:
>
>
> paul_at_paul-HiBox:~$ time curl -u paul:myPassword
> http://192.168.1.178/svn/svnRepo1/twoC --upload-file
> /home/paul/clientDir/seven
> <title>201 Created</title>
> real 0m15.312s
>
> Yay - a modest speed boost!!!

That would be about 4..5s for the checksum calculation
and the remaining time for the disk write (80..90MB/s).
>
>
> Restart Apache - which I didn't do before:
>
>
> paul_at_paul-HiBox:~$ systemctl restart apache2
>
> Create a new 501MB file on Svn Server:
>
>
> paul_at_paul-HiBox:~$ time curl -u paul:myPassword
> http://192.168.1.178/svn/svnRepo1/twoD --upload-file
> /home/paul/clientDir/seven
> <title>201 Created</title>
> real 0m14.925s
>
>
> Conclusion:
>
> With compression-level=5 (default), there is a 1:33 cp to
> curl-PUT ratio.
> With compression-level=0, there is a 1:10 cp to curl-PUT ratio.
>

If you want to measure pure CPU / protocol overhead,
put your repository on a RAM disk. Most machines
should be able to handle a <1GB repo.
>
> There are other alluring settings, such as...
>
> enable-rep-sharing = false
>
This kicks in only after the new data has been written
(i.e. all has come in) and *then* it is found to be redundant.
So, this option may save quite a bit of disk space but does
virtually nothing to reduce the time taken.
>
> enable-dir-deltification = false
>
That *can* actually help, iff you have deeply nested,
small directories. In that case, it will reduce I/O during
commit. The effect on read access can be all over the place.

> ... but they didn't yield an improvement.
>
> Thanks for all the replies, gang.

If you need to set up a repository for large, binary data,
use the svn: protocol, tune the repo as previously discussed,
pick a fast server CPU and you should be able to reach up
to 200MB/s over a single network connection in 1.9 with
'svn import'. 'svn commit' takes about twice as long but
that could be fixed eventually as well.

-- Stefan^2.
Received on 2017-07-07 20:39:56 CEST
