Ivan Zhakov wrote:
> Bert Huijben wrote:
>>> URL: http://svn.apache.org/viewvc?rev=1707196&view=rev
>>> Slightly optimize svn_stringbuf_from_stream() to avoid allocating twice
>>> more memory and unnecessary memcpy() when LEN_HINT is equal to final
>>> * subversion/libsvn_subr/stream.c
>>> (svn_stringbuf_from_stream): Always preallocate LEN_HINT +
>>> bytes to be able perform final read without stringbuf reallocation.
>> Can you explain why hint + MIN_READ_SIZE instead of something like
>> MAX(len_hint+1, MIN_READ_SIZE)
>> I don't know what MIN_READ_SIZE is from just this patch, but it could easily be
>> something like 16 Kbyte, while len_hint could be something like 16 for a file like
>> 'format' that was just statted to obtain the actual size
>> 16 Kbyte + 16 bytes for a 16 bytes file looks a bit large... And allocating 16 + 32
>> bytes is far more efficient than allocating that huge chunk at once.
> MIN_READ_SIZE is 64 bytes and it's locally defined in this function.
> We cannot use MAX(len_hint+1, MIN_READ_SIZE) because main loop
> reallocates buffer if don't have MIN_READ_SIZE remaining. The current
> code assumes that MIN_READ_SIZE is small value and I decided to leave
> this code intact. It could be improved of course, but I wanted to make
> minimum change first.
I don't think that was the right fix. One deficiency with the code is
that it doesn't break out of the loop when end-of-stream is detected,
which can be reported by a 'short' read. Your change doesn't solve
that problem in general, although it does solve it for the case where
the supplied length hint is exactly right.
We also have the function
which performs a virtually identical task. These two functions have
completely different implementations. svn_string_from_stream() does
break out of the loop as soon as it detects end-of-stream reported as
a short read. They also use completely different read sizes
(MIN_READ_SIZE = 64 bytes vs. SVN__STREAM_CHUNK_SIZE = 16384).
That is silly: there is no good reason for that, and it is a waste of
effort maintaining them separately, and discussing separate
optimizations for each.
We should make those two functions share the same "best" implementation.
Received on 2015-10-07 13:53:46 CEST