On Sun, Apr 13, 2014 at 6:18 PM, Bert Huijben <bert_at_qqmail.nl> wrote:
>
>
> > -----Original Message-----
> > From: stefan2_at_apache.org [mailto:stefan2_at_apache.org]
> > Sent: zondag 13 april 2014 11:45
> > To: commits_at_subversion.apache.org
> > Subject: svn commit: r1586947 - in
> > /subversion/trunk/subversion/libsvn_fs_fs: dag.c dag.h tree.c
> >
> > Author: stefan2
> > Date: Sun Apr 13 09:45:03 2014
> > New Revision: 1586947
> >
> > URL: http://svn.apache.org/r1586947
> > Log:
> > Improve MT scalability of the FSFS DAG traversal code.
> >
> > Error objects are a very expensive way to control the control flow
> > as they carry their own pools, created from a thread-safe root pool.
> >
> > dag_open should not return an error to open_path if the dirent
> > cannot be found, pass a NULL node back instead. This eliminates
> > about 50% of all transitional error objects during log-y operations.
> > As there are only 2 callers of dag_open, they are easy to adapt.
>
> Unless there are a lot of string creations to create a vey specific error
> message, I've never seen a construction of svn_error_t * in any performance
> traces...
>
As the log message said: This code was abusing
error messages for non-erroneous situations (point 1:
really bad design) with 10s of thousands of errors
being created, consumed and re-created (point 2:
inefficient).
The third point is that the implicit dcigettext call as
well as the apr_pool_t creation and destruction for
those error objects require process-wide synchronization.
That limits scalability. I'm currently looking into various
scenarios where a few concurrent requests are already
enough to hog the server or even make it go OOM.
>
> Unless you have some numbers to prove that this helps on more setups than
> yours, I don't see this as a *single reason* to change current behavior.
>
If you insist:
Server-side of 'svn log -v -g $tsvn_repo/trunk' (25k revs total in repo)
%Ir #called function
3.72 14062 svn_error_createf.constprop.259
2.72 9370 dcgettext
0.45 14064 svn_error_clear
(plus various small suff)
So, roughly 7% of the instructions are being wasted on
bad interface design. Now imagine what happens to repos
with somewhat more complicated merge histories.
And these numbers do *not* cover the OS sync. overhead.
>
> There could be valid other reasons of course.
>
Such as the ones mentioned in the log message.
> This looks like another case of micro-optimizing certain code which
> performance critical... I think there is more to gain with other kinds of
> optimizations, like changing total algorithms to scale better.
Last time I used a different hashing *algorithm*, you still
called it a micro-optimization hoping that a compiler could
do the job that I did in code ...
> It wouldn't be the first case where you optimized code that shouldn't have
> been executed in the first place.
>
So, did you review that code in the first place?
Note that there is still an open thread on dev_at_s.a.o about reverting a lot
> of changes like this in fsfs for guaranteed stability. I would say this
> falls in this same discussion if we decide to revert those.
>
There has been a thread about 1.9.0alpha1 (multiple threads,
actually). I will start one about FSFS format7 by the end of
this week.
> Looking further in your patch makes it appear that this is just a minor
> refactoring...
Hm. The obvious strategy here would be "read first, post later",
don't you agree?
I know that you work really hard and I fully appreciate
your feedback (and all your code contributions). Just
don't let the stress get the better of you ... ;)
> Not sure if the summary really makes sense for that as I don't think it
> will provide a measurable performance improvement, while the summary makes
> it appear that it resolves major multithreading problems.
>
As Julian already pointed out, the initial sentence ("lemma"
in Wiki-speak) may be somewhat of an overstatement. But
I'm in the process of figuring out why some operations e.g.
scale better in Apache than svnserve. This is one of the
things that came up. Now, both are faster with svnserve
seeing a slightly larger improvement.
-- Stefan^2.
Received on 2014-04-14 14:01:44 CEST