Re: Subversion 1.9.0-dev FSFS performance tests

From: Ivan Zhakov <ivan_at_visualsvn.com>
Date: Mon, 7 Jul 2014 18:58:06 +0400

On 1 July 2014 03:27, Johan Corveleyn <jcorvel_at_gmail.com> wrote:
> On Mon, Jun 30, 2014 at 6:21 PM, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
>> On 30 June 2014 18:51, Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> wrote:
>>> On Mon, Jun 30, 2014 at 4:06 PM, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
>>>> On 19 June 2014 14:21, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
>>>> > Hi,
>>>> >
>>>> > I've performed several FSFS performance tests using latest Subversion
>>>> > from trunk@r1602928.
>>>> >
>>>> I've re-run my FSFS performance tests with trunk@r1605444 using the latest
>>>> fsfs7 performance fixes, including combining indexes into revision files.
>> [..]
>>> Also, it seems that some of these tests are run from hot
>>> caches - causing a lot of variation and making comparison
>>> pointless. An extreme case:
>>> ptime 1.0 for Win32, Freeware - http://www.pc-tools.net/
>>> Copyright(C) 2002, Jem Berkes <jberkes_at_pc-tools.net>
>>> === "svn log http://localhost/svn/ruby-fsfs6-unpacked >nul" ===
>>> Execution time: 216.064 s
>>> ...
>>> Execution time: 13.268 s
>>> ...
>>> Execution time: 18.061 s
>> Yes, I use hot caches and already noted this in my report: "Every test
>> was run 3 times and only two latest used"
>> I don't see a reason to test on cold disk caches, because I assume
>> that caches on real servers are somewhat hot. That holds no matter how
>> hard it is to compare results on hot disk caches. For me, the log
>> addressing feature is definitely useless if it is slower on hot disk
>> caches.
> Innocent bystander's opinion: I appreciate your critical look at the
> performance of fsfs7, Ivan, but I disagree. I think in lots of
> real-world cases cold cache performance is much more important than
> hot cache performance.
> Especially in a big / busy repository and for use cases such as "svn
> log". I usually don't ask the same "svn log" three times in a row.
> User A might request "svn log" of some file or subtree, user B then
> asks for another subtree, and so on.

Caches on the real-world servers that I monitor are somewhat warm/hot for
busy repositories. The disk metadata is already cached, at least. But
I agree that querying "svn log" three times in a row for the same path
is too optimistic.
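
To keep the first run and the later (warmer) runs apart when comparing timings, a small harness along these lines could be used. This is only a sketch: `time_runs` and `summarize` are hypothetical helper names, and the first run is only truly "cold" if the OS caches were dropped beforehand, which this script does not do.

```python
import subprocess
import time


def time_runs(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock seconds of each run."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard stdout, like the ">nul" redirect in the quoted ptime test.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
        times.append(time.perf_counter() - start)
    return times


def summarize(times):
    """Report the first run separately from the mean of the later runs."""
    first, warm = times[0], times[1:]
    warm_mean = sum(warm) / len(warm) if warm else None
    return {"first": first, "warm_mean": warm_mean}
```

Applied to the three timings quoted above, `summarize([216.064, 13.268, 18.061])` reports 216.064 s for the first run and a mean of about 15.66 s for the two later runs.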

However, note that sequential updates of the same path are quite common
for busy repositories.

> It's the performance of that first (and usually only) request that matters most to me.
I think we should care about the overall performance. On the other
hand, performance optimization should be targeted at the real
cases where users experience performance bottlenecks.

> Same for export: different users export different parts of the
> repository all the time. Not the same part a couple of times in a row
> (except perhaps right after some release / milestone / ...).
> So I for one am more interested in cold-cache performance
> improvements, even if they come at a (hopefully small) hot-cache
> performance cost.
> OTOH, I do wonder about these (hot-cache) performance regressions ...
> maybe they can still be improved?
My technical opinion is that FSFS7/log addressing is slower by design,
because it does more work (read the index, then read the data, instead of
just reading the data), and only caching makes it comparable in
performance to FSFS6 repositories.
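
The extra step can be illustrated with a toy model. This is not the actual FSFS on-disk format: the fixed-width index layout and all names here (`CountingFile`, `build_toy_rev_file`, `read_physical`, `read_logical`) are assumptions made up for the illustration. It only counts how many reads each addressing scheme needs, with and without a cached index entry.

```python
import io
import struct

ENTRY = struct.Struct("<II")  # toy index entry: (offset, size)


class CountingFile:
    """In-memory file that counts how many reads are issued."""

    def __init__(self, data):
        self._f = io.BytesIO(data)
        self.reads = 0

    def read_at(self, offset, size):
        self.reads += 1
        self._f.seek(offset)
        return self._f.read(size)


def build_toy_rev_file(items):
    """Concatenate items, append a fixed-width index; return (bytes, index_offset)."""
    body = b"".join(items)
    index = b""
    offset = 0
    for item in items:
        index += ENTRY.pack(offset, len(item))
        offset += len(item)
    return body + index, len(body)


def read_physical(f, offset, size):
    """Physical addressing: the caller already knows the offset, so one read."""
    return f.read_at(offset, size)


def read_logical(f, index_offset, item_no, cache=None):
    """Logical addressing: look up the offset in the index, then read the data."""
    if cache is not None and item_no in cache:
        offset, size = cache[item_no]  # cached index entry: no index read
    else:
        raw = f.read_at(index_offset + item_no * ENTRY.size, ENTRY.size)
        offset, size = ENTRY.unpack(raw)
        if cache is not None:
            cache[item_no] = (offset, size)
    return f.read_at(offset, size)
```

In this model an uncached logical lookup costs two reads where the physical lookup costs one, and a repeated lookup with a populated cache drops back to one read.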

Ivan Zhakov
Received on 2014-07-07 16:58:56 CEST

This is an archived mail posted to the Subversion Dev mailing list.
