On Wed, Aug 13, 2014 at 8:21 PM, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
> On 23 July 2014 17:19, Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> wrote:
> > Updated and final results:
> > * svnadmin dump results now included, f6 / f7 is +/-20%
> > * fixed an anomaly with ra_serf, results consistent with previous findings;
> > 'null-export' tests have been rerun and the old results replaced
> > * added a page on how 'null-export' reacts to cache configuration,
> > to get a better picture of how cache size, ra layer and block-read
> > interact when caches are hot. Data for small or cold caches has
> > already been covered in other tests.
> I just want to note that your conclusions don't point out the fact that
> there is performance degradation in many cases:
Thank you for taking the time to look at the detailed results.
> 1. 'svn log -v' for bsd-nopack repository over svn:// in 'FAST'
> configuration is 51% slower
That is basically the rarest system state that you may be in:
* It is a repeated log request (OS caches are hot)
* The servers have been properly configured (caches on and large enough)
* Yet, the SVN caches are cold because you just restarted the server
How often do you restart your server application?
In the "fast" config, you will rarely hit the "only OS
is hot" case on Windows. There is only one, large
server process (i.e. OS caches are not much larger),
and if data can't be served from its caches, it will
often not be found in the OS caches either.
If you take a look at the absolute execution times,
you will see that even at a 90% OS hit rate, the
cold read latency will dominate the total execution time.
Finally, people who care enough to run a "fast" config
can also be expected to pack their repos when we
tell them that it will boost their performance. They
will then end up with 3.5s (f7 packed) instead of 10.2s
(f6 non-packed) in your edge case system state.
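To be concrete, the "pack your repo and give the server a real cache" advice boils down to something like the following (the path and cache size are made up for illustration, and the svnserve options are the ones I remember from 1.8+, so double-check `svnserve --help` on your build):

```shell
# Pack the repository once after upgrading/creating it
# (and again whenever many unpacked shards have piled up).
svnadmin pack /var/svn/repos/project

# Run svnserve with a large in-memory cache (size in MB) and
# fulltext caching enabled -- roughly the "FAST" configuration.
svnserve -d -r /var/svn/repos \
         --memory-cache-size 1024 \
         --cache-fulltexts yes
```
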
> 2. 'svn export' for ruby-nopack repository over file:// in 'FAST'
> configuration is 23% slower
I assume that you are referring to the "working copy
on server" test case. As you can see, there is a more
than 20% variation (red marker) in the individual
measurements. Simply compare the "hot OS" and
"hot SVN" values - they should be identical for file://.
The undisturbed null-export shows you something
like a 5..10% performance loss in the hot case for
basically all configurations and combinations.
Again, f7 becomes faster even over file:// and with
no specific cache settings when you pack your repo
and don't have perfect cache hit rates.
> So I ask for unbiased performance tests. As far as I remember, you
> advertised the 2x-10x performance improvement on the hackathon in
> Berlin (that is supposed to be already achieved at the time of
Yes, as a rule of thumb, 2x speed for export and
something like 10x for log -v. I achieved these in
my configurations (packed repo, Linux, local disks
in the server). And it is clear that these only apply to
cold reads; SVN cached data is format agnostic.
Looking at the Windows results, they are in line
with those rules of thumb.
And there is also a 'svnadmin verify' quick-check
option now that uncovers external corruption on
f7 repos about 100x faster than a full check
(fast disk array required, YMMV).
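For the record, the quick check is just this, if I remember the 1.9 flag name correctly (`svnadmin help verify` will confirm; the repository path is made up):

```shell
# Verify only the metadata/index structures instead of
# reconstructing every full text -- cheap on f7 thanks to
# its checksummed index data.
svnadmin verify --metadata-only /var/svn/repos/project
```
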
I also assumed that Windows might benefit from
things like block-read more than Unix, as we save
on fopen() operations. Later measurements suggested
that the block-read mainly improves cold read
throughput and that OS file API overhead is not
an issue on Windows.
> Then we have found (and fixed) several cases of
> performance degradation. But you didn't tell us about these cases. So
> I consider your performance measurements biased.
That is not what I remember. A big discussion started
once I said "repos need to be packed and servers
properly configured". And exactly that implies that all
other cases may perform worse than before.
During the course of the discussion the following
requirements were added:
* There shall be no significant penalty for non-packed f7 vs. f6.
* There shall be no significant penalty for f7 vs. f6 over file://.
* There shall be no significant penalty for f7 vs. f6 in default
server configurations (mainly cache settings).
* A 20% performance penalty is "significant".
These, plus Windows-specific issues that were found
since then, are what I fixed in the weeks following the
Berlin hackathon. I suspect that those changes hardly
improved the results of my original configuration
(merging index and revision data does not make
a great difference for packed repos).
However, those changes extended the range of
benefiting configurations to virtually every scenario
where we read data from disk. Getting to 2x speed,
however, requires specific server setup. People who
care less either don't get hurt (non-packed) or get
a freebie (packed) when upgrading to f7.
> I'm still getting bad numbers in my tests.
That may well be. Have you read my Wiki page yet?
Without taking specific measures, you are pretty
much bound to produce either unfair or unrealistic
test data. I spent a few days just to create the tools
(links into our repo are in the wiki) and methodology
that will work in less controlled environments than
my Linux home server setup.
> But obviously, my tests can
> be considered as biased too because I am strongly against this
> feature for many reasons. That's why we need an unbiased performance
Well, start with the wiki page and tell me whether
you agree with the methodology. I might be missing something.
> I think that the unbiased tester should pay attention to ensure both
> the following:
> 1) there is *significant* performance improvement for some realistic cases
>50% is significant. Cold read is significant.
Hot scenarios are client or network limited
(exception: log over 1Gb network).
> 2) there is no performance degradation in *all the typical Subversion
> usage configurations* (including the worst ones).
"No degradation, never" is certainly not a reasonable
requirement. In Berlin, people seemed fine with a
20% penalty *in some cases*. The variation between
individual test runs alone is often higher than that.
I mean, we *know* that f7 repos are 2..5% larger
due to the index data. It is perfectly reasonable to
assume that their performance "baseline" is worse
than f6 by roughly the same amount and some of
the "hot OS" runs show exactly that.
The key is to get significantly faster in many typical configurations.
> It seems that CollabNet and other hosting providers possibly have one
> of the worst configurations for the log-addressing feature. Multiple users
> access a number of gigabyte-sized repositories over HTTP (so there
> is not enough memory for enormous caches).
Well, the key would be GB-sized projects (data
transferred during export).
I don't know how I feel about getting CollabNet involved.
My concern is that they simply won't have the time to
do it because we are not talking about "please, would
you run those 2 commands for me?" but rather a two
> Authorization is enabled
> (my tests show that this is important).
Authz is only relevant to the degree that it turns
"log" into "log -v". And it adds a format-independent
constant (reading the file) and proportional overhead
(checking paths) to all ops. The larger the authz file,
the more "blurred" the results will be.
> Apache httpd uses prefork MPM
> (that should eliminate the caches).
Only the "hot" cases. It does not eliminate the impact
of e.g. revprop caching during log. The impact of ra_serf
using ~3 processes during export is hard to predict, but
some caching should be beneficial (DAG node cache,
> Also I'd like to note that your method to achieve the 'Cold' state on
> Windows is totally wrong: you're basically allocating *all* available
> memory with the 'ClearMemory' tool, making the OS starve on all resources.
Is it unfair? It should hurt all repo configurations equally.
How do you know it is starving the OS of all resources?
I simply allocate all free and cache *RAM*. I'm not even
forcing anything to be swapped out or removed from the
file cache.
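To make the mechanism explicit: a tool like ClearMemory presumably does little more than the following sketch (my reconstruction, not the tool's actual source) - allocate roughly the free-plus-cache amount of RAM in chunks, touch every page so the OS has to commit physical memory for it (evicting clean file-cache pages in the process), then exit and release it all.

```python
PAGE_SIZE = 4096

def touch_pages(buf):
    """Write one byte per page so the OS must actually commit
    physical memory for the whole buffer."""
    for offset in range(0, len(buf), PAGE_SIZE):
        buf[offset] = 1

def allocate_and_touch(total_bytes, chunk_bytes=64 * 1024 * 1024):
    """Allocate ~total_bytes in chunks and touch every page.
    Keeping the returned list alive keeps the memory committed;
    dropping it (or exiting the process) returns it all to the OS."""
    chunks = []
    remaining = total_bytes
    while remaining > 0:
        size = min(chunk_bytes, remaining)
        buf = bytearray(size)
        touch_pages(buf)
        chunks.append(buf)
        remaining -= size
    return chunks
```

Note that nothing here writes to the swap file; the OS merely has to drop clean cached pages to satisfy the allocation.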
> It's abnormal situation for operating system. So your 'Cold' state
> results are irrelevant.
Well, the alternative is to make sure that between
test cycles (i.e. when we come back to the first repo),
enough data has been read to make the caches cool
down far enough.
Since the total repo size is something like 50GB, I can
rerun the cold tests with undefined initial cache state.
The results may simply be more noisy.
Received on 2014-08-13 22:45:51 CEST