
Re: svn commit: r1698359 - in /subversion/trunk/subversion: include/svn_io.h libsvn_subr/stream.c svnadmin/svnadmin.c svnfsfs/load-index-cmd.c tests/libsvn_subr/stream-test.c

From: Branko Čibej <brane_at_wandisco.com>
Date: Tue, 1 Sep 2015 21:08:46 +0200

On 01.09.2015 20:26, Evgeny Kotkov wrote:
> Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> writes:
>
>> Yes. This is exactly why we can only use it when we have reasonable control
>> over the stream's usage, i.e. we can use it in our CL tools because all the
>> code that will be run is under our control. But we cannot make e.g.
>> svn_stream_for_stdin() use it by default.
> [...]
>
>> The best solution seems to be to allow for explicit resource management as
>> we do with other potentially "expensive" objects. r1700305 implements that.
> I have several concerns about these changes (r1698359 and r1700305):

FWIW: I agree with Evgeny's analysis and conclusions. There surely must
be a way to get reasonable performance from a generic stream without the
really flaky memory management that these changes bring.

One approach might be a similar buffered-stream wrapper that supports
mark/seek, but where the caller provides a (fixed-size) buffer and/or
buffer management callbacks. Something like that would make the
buffering explicit to the API consumer, although things might still
become tricky if such a stream is used in a generic stream context.
Perhaps such a buffered stream should be a completely different type of
object.
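
To make the idea concrete, here is a minimal sketch in plain C of what such
a wrapper could look like. It is not Subversion code and not a proposed API:
bufstream_t, read_fn, mem_src_t and all function names are hypothetical, and
error handling and pool usage are left out. The caller owns the fixed-size
buffer, and a mark is only seekable while its offset is still inside the
buffered window.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical read callback: copies up to LEN bytes into DST and
   returns the number of bytes copied (0 at end of input). */
typedef size_t (*read_fn)(void *baton, char *dst, size_t len);

typedef struct {
    read_fn read;   /* underlying (non-seekable) source */
    void *baton;
    char *buf;      /* caller-provided storage, never reallocated */
    size_t cap;     /* size of BUF */
    size_t start;   /* absolute stream offset of buf[0] */
    size_t len;     /* number of valid bytes in BUF */
    size_t pos;     /* read cursor, relative to BUF */
} bufstream_t;

typedef size_t bufstream_mark_t;  /* absolute stream offset */

static void bufstream_init(bufstream_t *bs, read_fn read, void *baton,
                           char *buf, size_t cap)
{
    assert(cap > 0);
    bs->read = read;
    bs->baton = baton;
    bs->buf = buf;
    bs->cap = cap;
    bs->start = bs->len = bs->pos = 0;
}

/* Return the next byte, or -1 at end of input.  A refill discards the
   old window, so marks that pointed into it can no longer be reached. */
static int bufstream_getc(bufstream_t *bs)
{
    if (bs->pos == bs->len) {
        bs->start += bs->len;
        bs->len = bs->read(bs->baton, bs->buf, bs->cap);
        bs->pos = 0;
        if (bs->len == 0)
            return -1;
    }
    return (unsigned char)bs->buf[bs->pos++];
}

static bufstream_mark_t bufstream_mark(const bufstream_t *bs)
{
    return bs->start + bs->pos;
}

/* Return 0 on success, -1 if MARK lies outside the buffered window. */
static int bufstream_seek(bufstream_t *bs, bufstream_mark_t mark)
{
    if (mark < bs->start || mark > bs->start + bs->len)
        return -1;
    bs->pos = mark - bs->start;
    return 0;
}

/* In-memory source, standing in for stdin in this sketch. */
typedef struct { const char *data; size_t size; size_t off; } mem_src_t;

static size_t mem_read(void *baton, char *dst, size_t len)
{
    mem_src_t *s = baton;
    size_t n = s->size - s->off;
    if (n > len)
        n = len;
    memcpy(dst, s->data + s->off, n);
    s->off += n;
    return n;
}
```

With a 4-byte buffer over an 8-byte input, a mark taken after the first byte
can be restored until the fifth byte is read; after that bufstream_seek()
reports failure instead of growing the buffer behind the caller's back. That
explicit failure is the tradeoff this kind of design would make.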

> As for the problem itself, if the way we currently process the input during
> svnadmin load and load-revprops is causing a noticeable overhead, I think that
> we should introduce -F (--file) option to both of these commands:
>
> svnadmin load /path/to/repos -F (--file) /path/to/dump
>
> svnadmin load-revprops /path/to/repos -F (--file) /path/to/dump
>
> As long as file streams support both svn_stream_seek() and svn_stream_mark(),
> this should avoid byte-by-byte processing of the input and get rid of the
> associated overhead.

This would not solve the common case where users pipe dump output straight
into load, precisely to avoid the possibly huge, or even unmanageable,
overhead of creating an intermediate dumpfile.

-- Brane
Received on 2015-09-01 21:08:59 CEST

This is an archived mail posted to the Subversion Dev mailing list.
