On 17.02.2016 15:33, Ivan Zhakov wrote:
> On 16 February 2016 at 00:47, <stefan2_at_apache.org> wrote:
>> Author: stefan2
>> Date: Mon Feb 15 21:47:00 2016
>> New Revision: 1730617
>> URL: http://svn.apache.org/viewvc?rev=1730617&view=rev
>> Continue work on the svn_repos_get_logs4 to svn_repos_get_logs5 migration:
>> Switch the last svn_fs_paths_changed2 call to svn_fs_paths_changed3.
>> * subversion/libsvn_repos/log.c
>> (fs_mergeinfo_changed): No longer fetch the whole changes list. However,
>> we need to iterate twice for best total performance
>> and we need to minimize FS iterator lifetimes.
> It seems that I would be -1 against this particular change. In the
> current implementation the svn_fs_paths_changed3() is called twice
> that in the worst case will lead to *double read from disk*.
Pick your worst-case behaviour:
(1) Crash the server with OOM.
(2) For each change in a revision, perform a 1-step history lookup
(random I/O, about 2x as much data to read as with (3)).
(3) Read a linear file section twice.
I went with option (3). What is your preference?
> As far as I understand you're relying to the fact that the second call
> will hit the FSFS/FSX cache. But there will be a significant
> performance degradation comparing to the 1.9 implementation in the
> case of cache miss.
On a system without OS-side file cache, 'log -g'
on the repository root would take at most twice
as long as in 1.9. Other operations are not
Running it on any other directory will actually
be faster in many cases with 1.10 because the new
history traversal code no longer reconstructs all
directories up the tree for each revision.
So, hardly anything people will complain about.
> As we are adding more and more of such code, more and more users
> become faced with the significant performance regression (see  and
> other cases).
If you consider  a significant performance regression,
please follow up on it and review the two relevant
backports for 1.9.
> Do you intend to resolve this problem in the future commits? I have
> some obvious solutions in mind, but maybe you also know something
> about this.
The only reasonable alternative is to pick option
(2) and hope that the performance regression in
real-life is acceptable.
>  http://svn.haxx.se/users/archive-2015-12/0060.shtml
Received on 2016-02-19 08:35:46 CET