Re: FSFS format7 status and first results

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Sat, 16 Feb 2013 22:30:29 +0100

On Sat, Feb 16, 2013 at 5:47 PM, Mark Phippard <markphip_at_gmail.com> wrote:

> On Sat, Feb 16, 2013 at 4:52 AM, Stefan Fuhrmann
> <stefan.fuhrmann_at_wandisco.com> wrote:
> > Hey all,
> >
> > Just to give you an update on what is going on on that branch,
> > here are a few facts and numbers. Bottom line is that there is
> > still a lot to do but the basic assumptions proved correct and
> > significant benefits can already be demonstrated.
> >
> > * about 20% of the coding is done so far
> > * some core features implemented:
> > logical addressing, reorg upon pack, block read
>
> What do you mean by pack here? Is it svnadmin pack?

svnadmin pack

> Is that in any way an essential part of the performance boost?

Yes. It will place items (noderevs, representations, change lists)
next to each other when they are likely to be requested shortly
after one another. For instance, it tries to concatenate all elements
of a deltification chain.
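
To illustrate the idea, here is a minimal sketch in Python. It is not
the code on the branch: the item ids, the "delta base" mapping and the
base-first ordering are all assumptions made for the example. It only
shows how items can be reordered so that each deltification chain ends
up contiguous.

```python
from collections import defaultdict


def pack_order(items):
    """Return item ids ordered so that each delta chain is contiguous.

    `items` maps an item id to the id of its delta base (None for an
    item stored as full text). This is a hypothetical model and not
    the FSFS on-disk layout.
    """
    children = defaultdict(list)
    roots = []
    for item, base in items.items():
        if base is None:
            roots.append(item)
        else:
            children[base].append(item)

    order = []
    for root in roots:
        # Depth-first walk: a base is always emitted before the
        # items that are deltified against it.
        stack = [root]
        while stack:
            item = stack.pop()
            order.append(item)
            # reversed() keeps siblings in their original order.
            stack.extend(reversed(children[item]))
    return order


# Two interleaved chains, as they might be written by successive commits.
example = {"a1": None, "b1": None, "a2": "a1", "b2": "b1", "a3": "a2"}
packed = pack_order(example)
```

In this example the interleaved commit order a1, b1, a2, b2, a3 becomes
a1, a2, a3, b1, b2, so following one chain touches neighbouring items only.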

> Or are your format7 repositories always packed?
>

They are not. Unpacked revisions will see a performance hit from
reading the two extra index files per revision and a boost from
block-read which will often fetch the whole revision with a single
I/O operation.
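
As a rough sketch of why block-read helps small revisions (Python, with
an assumed 64 kB block size and a made-up in-memory index; neither is
taken from the branch): one aligned read returns every item that lies
inside the block, so a revision smaller than the block needs only one
I/O operation.

```python
import io

BLOCK_SIZE = 64 * 1024  # assumed block size, not taken from the branch


def read_block(f, offset, block_size=BLOCK_SIZE):
    """Read the aligned block that contains `offset` with one read call."""
    start = (offset // block_size) * block_size
    f.seek(start)
    return start, f.read(block_size)


def items_in_block(index, start, data):
    """Slice out every indexed item that lies entirely inside the block.

    `index` maps a hypothetical item name to (offset, size).
    """
    end = start + len(data)
    return {
        name: data[off - start:off - start + size]
        for name, (off, size) in index.items()
        if off >= start and off + size <= end
    }


# A tiny stand-in for a revision file: three items within one block.
rev_file = io.BytesIO(b"noderev!" + b"representation" + b"changes")
index = {"noderev": (0, 8), "rep": (8, 14), "changes": (22, 7)}

# Asking for one item pulls in the whole block ...
start, data = read_block(rev_file, index["rep"][0])
# ... so the other items of the revision can be cached from the same read.
cache = items_in_block(index, start, data)
```

Here a request for "rep" also yields "noderev" and "changes" without
touching the file again.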

> > * format 7 repos are ~3x faster due to reduced I/O
> > * format 6 repos get faster by ~10%
>
> When you talk about format6 and branch6 and compare to trunk, what do
> you mean? Is that just how the format6 repository, which is already
> in trunk, will fare with improved caching that is in your branch?
>

Yes. Improved caching and optimized reporting order (i.e. in what
order will an export / checkout read the tree) apply to existing format6
repos as well.

> > * format 7 performance hit by cache inefficiency;
> > will be addressed by 'container' objects plus a
> > change in the membuffer implementation
> > * on-disk format still in flux and will remain
> > so for a while
> >
> > Tests were run against a local copy of the TSVN repo with
> > revprop compression, directory deltification and property
> > deltification enabled to bring format 6 structurally as
> > close to format 7 defaults as possible.
> >
> > All values are given in user-time triples for trunk_at_1446787
> > on format 6, fsfs-format7_at_1446862 on format 6 and format 7.
> > "hot" runs mean "from OS cache"; svnserve was restarted
> > before every test run.
> >
> > $ time svnadmin verify -M 4000 -q $repo
> >
> > medium trunk / branch6 / branch7
> > USB cold 270.5s / 246.3s / 104.4s = 1.00 : 1.10 : 2.59
> > USB hot 248.0s / 191.0s / 60.8s = 1.00 : 1.30 : 4.08
> > SSD cold 66.1s / 62.0s / 59.4s = 1.00 : 1.07 : 1.11
> > SSD hot 63.2s / 60.0s / 57.1s = 1.00 : 1.05 : 1.11
> >
> > $ time svnbench null-export svn://localhost/tsvn/trunk -q
>
> Any reason you test with svn:// and not http://? I feel like the
> latter is the most widely used server by a wide margin.
>

Quite a number of reasons:

* easy setup
* minimal overhead (I want to get as close to measuring pure
FS layer performance as possible)
* easy to debug and profile

> > USB cold 44.29s / 39.63s / 16.24s = 1.00 : 1.12 : 2.73
> > USB hot 10.68s / 10.25s / 3.78s = 1.00 : 1.04 : 2.83
> > SSD cold 5.72s / 5.00s / 3.75s = 1.00 : 1.14 : 1.53
> > SSD hot 2.37s / 2.38s / 3.21s = 1.00 : 1.00 : 0.74
> >
> > $ time svnbench null-log svn://localhost/tsvn/trunk -v -q
> >
> > USB cold 53.36s / 50.17s / 8.73s = 1.00 : 1.06 : 6.11
> > USB hot 43.64s / 36.46s / 3.52s = 1.00 : 1.20 : 12.40
> > SSD cold 9.32s / 10.60s / 3.22s = 1.00 : 0.88 : 2.89
> > SSD hot 2.36s / 2.28s / 2.88s = 1.00 : 1.04 : 0.82
> >
> > $ time svnbench null-log svn://localhost/tsvn/trunk -v -g -q
> >
> > USB cold 98.02s / 87.01s / 23.74s = 1.00 : 1.13 : 4.13
> > USB hot 69.88s / 57.14s / 7.88s = 1.00 : 1.22 : 8.87
> > SSD cold 8.35s / 10.50s / 8.16s = 1.00 : 0.80 : 1.02
> > SSD hot 5.94s / 5.72s / 6.39s = 1.00 : 1.04 : 0.93
> >
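
For anyone re-checking the tables: the ratio columns are the trunk time
divided by each measured time, so values above 1.00 mean "faster than
trunk". A small Python sketch of that arithmetic (the function name is
made up for illustration):

```python
def speedups(trunk, branch6, branch7):
    """Format trunk-relative speedups like the ratio columns above."""
    ratios = (trunk / t for t in (trunk, branch6, branch7))
    return " : ".join(f"{r:.2f}" for r in ratios)


# "svnadmin verify", USB cold: 270.5s / 246.3s / 104.4s
verify_usb_cold = speedups(270.5, 246.3, 104.4)

# "null-export", SSD hot: 2.37s / 2.38s / 3.21s
export_ssd_hot = speedups(2.37, 2.38, 3.21)
```

The two calls reproduce "1.00 : 1.10 : 2.59" and "1.00 : 1.00 : 0.74",
the latter being one of the cases where format 7 is currently slower.
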
> > Tests have been conducted with maximum optimization:
> >
> > ./configure --disable-shared --disable-debug --enable-optimize \
> > --without-berkeley-db --without-serf CUSERFLAGS='-march=native'
>
> Do we have, or will we have, a document or wiki that suggests
> optimization flags for packagers?
>

'--enable-optimize' is new in 1.8. It should probably be documented
somewhere but I'm not sure how safe it is to *recommend* it to
packagers. The optimizations are quite aggressive and might break
unclean code.

I used it in conjunction with '-march=native' to minimize CPU time
relative to I/O time. It saved 3 to 5% of CPU cycles in my tests.

-- Stefan^2.

-- 
Certified & Supported Apache Subversion Downloads:
http://www.wandisco.com/subversion/download

Received on 2013-02-16 22:31:05 CET
