RE: svn commit: r1586947 - in /subversion/trunk/subversion/libsvn_fs_fs: dag.c dag.h tree.c

From: Bert Huijben <bert_at_qqmail.nl>
Date: Mon, 14 Apr 2014 14:54:44 +0200

Can you please improve your log message to tell what you actually did here.

You improved the code flow within some private parts of fsfs to avoid creating errors.

What your log message tells to somebody that is reading the log message for review:

I read the message as

* â€œHUGE PERFORMANCE IMPROVEMENT FOR MULTITHREADING !!!â€

* â€œPool creation in error production is expensive and should be avoidedâ€

So youâ€™re telling that you make things less expensive, as the pool creation is so expensiveâ€¦

But in fact you are not changing any of this, but are avoiding gettext, which doesnâ€™t use apr pools at all. And the pools arenâ€™t in the performance trace.

In the past we tried to use error code only errors for these *expensive* errors. These errors would then only get a (translated) message when their error message was used. That is why all our error codes have default messages.

But why do we use gettext on server processes side in the first place?

We have no way to transfer the locale to the server, so we use the â€œCâ€ locale anyway. Using gettext in the client makes sense, but on the server not so much.

Bert

From: Stefan Fuhrmann [mailto:stefan.fuhrmann_at_wandisco.com]
Sent: maandag 14 april 2014 14:01
To: Bert Huijben
Cc: Subversion Development; commits_at_subversion.apache.org
Subject: Re: svn commit: r1586947 - in /subversion/trunk/subversion/libsvn_fs_fs: dag.c dag.h tree.c

On Sun, Apr 13, 2014 at 6:18 PM, Bert Huijben <bert_at_qqmail.nl <mailto:bert_at_qqmail.nl> > wrote:

> -----Original Message-----
> From: stefan2_at_apache.org <mailto:stefan2_at_apache.org> [mailto:stefan2_at_apache.org <mailto:stefan2_at_apache.org> ]
> Sent: zondag 13 april 2014 11:45
> To: commits_at_subversion.apache.org <mailto:commits_at_subversion.apache.org>
> Subject: svn commit: r1586947 - in
> /subversion/trunk/subversion/libsvn_fs_fs: dag.c dag.h tree.c
>
> Author: stefan2
> Date: Sun Apr 13 09:45:03 2014
> New Revision: 1586947
>
> URL: http://svn.apache.org/r1586947
> Log:
> Improve MT scalability of the FSFS DAG traversal code.
>
> Error objects are a very expensive way to control the control flow
> as they carry their own pools, created from a thread-safe root pool.
>
> dag_open should not return an error to open_path if the dirent
> cannot be found, pass a NULL node back instead. This eliminates
> about 50% of all transitional error objects during log-y operations.
> As there are only 2 callers of dag_open, they are easy to adapt.

Unless there are a lot of string creations to create a vey specific error message, I've never seen a construction of svn_error_t * in any performance traces...

As the log message said: This code was abusing

error messages for non-erroneous situations (point 1:

really bad design) with 10s of thousands of errors
being created, consumed and re-created (point 2:

inefficient).

The third point is that the implicit dcigettext call as
well as the apr_pool_t creation and destruction for
those error objects require process-wide synchronization.
That limits scalability. I'm currently looking into various

scenarios where a few concurrent requests are already

enough to hog the server or even make it go OOM.

Unless you have some numbers to prove that this helps on more setups than yours, I don't see this as a *single reason* to change current behavior.

If you insist:

Server-side of 'svn log -v -g $tsvn_repo/trunk' (25k revs total in repo)

%Ir #called function

3.72 14062 svn_error_createf.constprop.259

2.72 9370 dcgettext

0.45 14064 svn_error_clear

(plus various small suff)

So, roughly 7% of the instructions are being wasted on
bad interface design. Now imagine what happens to repos
with somewhat more complicated merge histories.

And these numbers do *not* cover the OS sync. overhead.

There could be valid other reasons of course.

Such as the ones mentioned in the log message.

This looks like another case of micro-optimizing certain code which performance critical... I think there is more to gain with other kinds of optimizations, like changing total algorithms to scale better.

Last time I used a different hashing *algorithm*, you still
called it a micro-optimization hoping that a compiler could
do the job that I did in code ...

It wouldn't be the first case where you optimized code that shouldn't have been executed in the first place.

So, did you review that code in the first place?

Note that there is still an open thread on dev_at_s.a.o <mailto:dev_at_s.a.o> about reverting a lot of changes like this in fsfs for guaranteed stability. I would say this falls in this same discussion if we decide to revert those.

There has been a thread about 1.9.0alpha1 (multiple threads,
actually). I will start one about FSFS format7 by the end of
this week.

Looking further in your patch makes it appear that this is just a minor refactoring...

Hm. The obvious strategy here would be "read first, post later",
don't you agree?

I know that you work really hard and I fully appreciate

your feedback (and all your code contributions). Just

don't let the stress get the better of you ... ;)

Not sure if the summary really makes sense for that as I don't think it will provide a measurable performance improvement, while the summary makes it appear that it resolves major multithreading problems.

As Julian already pointed out, the initial sentence ("lemma"
in Wiki-speak) may be somewhat of an overstatement. But

I'm in the process of figuring out why some operations e.g.
scale better in Apache than svnserve. This is one of the
things that came up. Now, both are faster with svnserve

seeing a slightly larger improvement.

-- Stefan^2.
Received on 2014-04-14 14:55:45 CEST

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]