On 01.09.2015 20:26, Evgeny Kotkov wrote:
> Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> writes:
>
>> Yes. This is exactly why we can only use it when we have reasonable control
>> over the stream's usage, i.e. we can use it in our CL tools because all the
>> code that will be run is under our control. But we cannot make e.g.
>> svn_stream_for_stdin() use it by default.
> [...]
>
>> The best solution seems to be to allow for explicit resource management as
>> we do with other potentially "expensive" objects. r1700305 implements that.
> I have several concerns about these changes (r1698359 and r1700305):

FWIW: I agree with Evgeny's analysis and conclusions. There surely must
be a way to get reasonable performance from a generic stream without the
really flaky memory management that these changes bring.

One approach might be a similar buffered-stream wrapper that supports
mark/seek, but where the caller provides a (fixed-size) buffer and/or
buffer management callbacks. Something like that would make the
buffering explicit to the API consumer, although things might still
become tricky if such a stream is used in a generic stream context.
Perhaps such a buffered stream should be a completely different type of
object.
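
To make the idea a bit more concrete, here is a rough sketch of what such
a wrapper might look like. It is only an illustration: bufstream_t,
bufstream_mark_t, read_fn_t and the mem_source_t test source are all
hypothetical names, not existing Subversion API, and error handling is
reduced to plain return codes. The caller owns the fixed-size buffer, and
a mark stays valid only while its offset is still inside the buffered
window.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical source callback: copy up to LEN bytes into DST and return
   the number of bytes actually read (0 means end of input). */
typedef size_t (*read_fn_t)(void *baton, char *dst, size_t len);

typedef struct bufstream_t {
  read_fn_t read;   /* underlying, unbuffered source */
  void *baton;      /* passed through to READ */
  char *buf;        /* caller-owned storage, never reallocated */
  size_t cap;       /* size of BUF */
  size_t start;     /* absolute stream offset of buf[0] */
  size_t len;       /* number of valid bytes in BUF */
  size_t pos;       /* read cursor, relative to BUF */
} bufstream_t;

typedef struct bufstream_mark_t {
  size_t offset;    /* absolute stream offset */
} bufstream_mark_t;

void bufstream_init(bufstream_t *s, read_fn_t read, void *baton,
                    char *buf, size_t cap)
{
  assert(buf != NULL && cap > 0);
  s->read = read;
  s->baton = baton;
  s->buf = buf;
  s->cap = cap;
  s->start = s->len = s->pos = 0;
}

/* Read one byte. Returns 1 on success, 0 at end of input. */
int bufstream_getc(bufstream_t *s, char *c)
{
  if (s->pos == s->len)
    {
      /* Window exhausted: drop it and refill from the source. */
      s->start += s->len;
      s->len = s->read(s->baton, s->buf, s->cap);
      s->pos = 0;
      if (s->len == 0)
        return 0;
    }
  *c = s->buf[s->pos++];
  return 1;
}

/* Remember the current position. */
void bufstream_mark(const bufstream_t *s, bufstream_mark_t *m)
{
  m->offset = s->start + s->pos;
}

/* Go back to a mark. Returns 0 on success, -1 if the marked offset is no
   longer (or not yet) inside the buffered window. */
int bufstream_seek(bufstream_t *s, const bufstream_mark_t *m)
{
  if (m->offset < s->start || m->offset > s->start + s->len)
    return -1;
  s->pos = m->offset - s->start;
  return 0;
}

/* In-memory source, used only to exercise the sketch. */
typedef struct mem_source_t {
  const char *data;
  size_t len;
  size_t off;
} mem_source_t;

size_t mem_read(void *baton, char *dst, size_t len)
{
  mem_source_t *m = baton;
  size_t n = m->len - m->off;
  if (n > len)
    n = len;
  memcpy(dst, m->data + m->off, n);
  m->off += n;
  return n;
}
```

Memory use is bounded by whatever the caller passes in, and a seek that
would need data outside the window fails explicitly instead of growing a
hidden buffer. That explicit failure is also where a generic stream
context gets tricky, since generic consumers expect a mark to remain
usable. This sketch drops the whole window on refill; a real
implementation would probably want to retain some tail of it.
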
> As for the problem itself, if the way we currently process the input during
> svnadmin load and load-revprops is causing a noticeable overhead, I think that
> we should introduce -F (--file) option to both of these commands:
>
> svnadmin load /path/to/repos -F (--file) /path/to/dump
>
> svnadmin load-revprops /path/to/repos -F (--file) /path/to/dump
>
> As long as file streams support both svn_stream_seek() and svn_stream_mark(),
> this should avoid byte-by-byte processing of the input and get rid of the
> associated overhead.

This would not solve the common case where users dump/load through a
pipe, without incurring the possibly huge, or even unmanageable,
overhead of creating an intermediate dumpfile.

-- Brane
Received on 2015-09-01 21:08:59 CEST