[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r1586922 - in /subversion/trunk/subversion: include/private/svn_io_private.h include/svn_types.h libsvn_subr/io.c libsvn_subr/stream.c

From: Ivan Zhakov <ivan_at_visualsvn.com>
Date: Fri, 18 Apr 2014 20:21:46 +0400

On 18 April 2014 15:46, Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> wrote:
> On Fri, Apr 18, 2014 at 10:06 AM, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
>> On 14 April 2014 20:11, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
>> > On 13 April 2014 08:40, <stefan2_at_apache.org> wrote:
>> >> Author: stefan2
>> >> Date: Sun Apr 13 04:40:40 2014
>> >> New Revision: 1586922
>> >>
>> >> URL: http://svn.apache.org/r1586922
>> >> Log:
>> >> Speed up file / stream comparison, i.e. minimize the processing
>> >> overhead
>> >> for finding the first mismatch.
>> >>
>> >> The approach is two-sided. Instead of fetching SVN__STREAM_CHUNK_SIZE
>> >> from all sources before comparing data, we start with a much lower 4kB
>> >> and increase the chunk size until we reach SVN__STREAM_CHUNK_SIZE while
>> >> making sure that all reads are naturally aligned. So, we quickly find
>> >> mismatches near the beginning of the file.
>> >>
>> >> On the other end side, we bump the SVN__STREAM_CHUNK_SIZE to 64kB which
>> >> gives better throughput for longer distances to the first mismatch -
>> >> without causing ill effects in APR's memory management.
>> >>
>> >> * subversion/include/svn_types.h
>> >> (SVN__STREAM_CHUNK_SIZE): Bump to 64k and add some documentation on
>> >> the general restrictions for future
>> >> changes.
>> >>
>> > Hi Stefan,
>> >
>> > You effectively reverted my recent fix (r1581296) for high memory
>> > usage with many repositories open at the same time (about 500k per
>> > repository. Also please consider r857671 and discussion before that
>> > commit:
>> > http://svn.haxx.se/dev/archive-2004-11/0123.shtml
>> >
>> > So please revert.
>> >
>> Sorry, may be I was not clear enough, but I was vetoing this change.
>> Given no reply, I'm going to revert it.
> [Been AFK for the last 3 days]
> Don't revert; your veto is based on faulty measurements & assumptions.
> Here the shortlist:
My veto is still my veto, you cannot judge it.

You don't get the point: you changed *global* constant used in
multiple situation for just FSFS optimization. The value for this
constant was discussed long time ago and was reduced for serious
reasons. You changed it just to fix specific FSFS cases. I believe
it's wrong way for coding.

You also made libsvn_wc a bit slower, because while your 'slow start'
compare algorithm may be works fine for FSFS it doesn't work well for
comparing on disk files in WC.

> * SVN__STREAM_CHUNK_SIZE is only to be used with files and
> streams (duh!) - not with data structures that persist in RAM.
> If your use of that constant were to actually effect the size of a
> repo instance, that use would be faulty.
> * parse_context_t is only instantiated in one place,
> svn_config__parse_stream
> and will be destroyed at the end of that function. So, there is no
> persistent memory use from this one.
This depends on allocator and heap behavior.

> * According to *my* measurements, a FSFS instance takes about
> 46kB just after fs_open() and under 64 bit Unix. Of those 46k,
> 64k were used - according to you - for the stream buffer alone ...
Yes, on trunk FSFS instances doesn't use so much memory, because
specific subpools added and APR buffered files avoided. But this
issues can reappear again in future, like it was in Subversion 1.8.x.

So, I'm still veto this change. Please revert it. Then I suggest you
make this stream comparision algorithm private for FSFS, becuase it's
makes sense only for 'expensive' streams.

I'm ready to discuss increasing SVN__STREAM_CHUNK_SIZE to 64k or any
other value, if there are serous reasons for doing this, but only
after reverting this change from trunk.

Ivan Zhakov
Received on 2014-04-18 18:22:40 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.