[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r1072519 - Add svn_stream_skip(), svn_stream_buffered(), svn_stream_supports_mark()

From: Stefan Fuhrmann <eqfox_at_web.de>
Date: Thu, 02 Jun 2011 00:08:16 +0200

On 01.06.2011 16:08, Julian Foad wrote:
> On Sun, 2011-02-20, stefan2_at_apache.org wrote:
>> Merge all changes (r1068695 - r1072516) from the
>> integrate-stream-api-extensions.
>>
>> These patches add svn_stream_skip(), svn_stream_buffered()
>> and svn_stream_supports_mark().
> A belated review.
>
<snip>
>> +/** Returns @c TRUE if the generic @a stream supports svn_stream_mark().
>> + *
>> + * @see svn_stream_mark()
>> + * @since New in 1.7.
>> + */
>> +svn_boolean_t
>> +svn_stream_supports_mark(svn_stream_t *stream);
> Would a better name be svn_stream_supports_mark_seek()?
>
> I do wonder whether we need this. I guess the idea is it's supposed to
> be faster than calling svn_stream_mark() and checking for the error
> SVN_ERR_STREAM_SEEK_NOT_SUPPORTED.
This simply tell the caller whether svn_stream_mark()
is implemented / supported by the particular stream.
Its result may not even be bound to the type of stream
but be specific to a given instance (currently, all streams
return a static response, though).

Misusing svn_stream_mark() to mimic that behavior
would be bad style and leaves an unused mark structure
in case of success. That alone already justifies the
check function, IMO.

You might rename it to svn_stream_supports_marks(),
if that feels more natural. But I would not expect any
stream to allow for setting a mark but (predictably)
failing to seek to it later.
>> +/** Return whether this generic @a stream uses internal buffering.
>> + * This may be used to work around subtle differences between buffered
>> + * an non-buffered APR files.
>> + *
>> + * @since New in 1.7.
>> + */
>> +svn_boolean_t
>> +svn_stream_buffered(svn_stream_t *stream);
> I renamed the symbols from ...buffered... to ...is_buffered... in
> r1130118, for clarity.
+1 on that.
> Does "is buffered" mean "this layer of the stream implements its own
> buffering layer"? Does it mean "this stream knows that there is a RAM
> buffer at some layer of the stream implementation, so some reads from
> this stream can be served without I/O latency"? Should a layered
> stream, for example a checksumming stream, delegate to what its
> underlying stream says? The semantics seem insufficiently clear for a
> general-purpose stream.
I admit that the semantics are fuzzy. The thing that I
need to test for is the following: Is it more efficient
to read a larger number (e.g. 80) of bytes at once
although I will re-read half of that later (because we
found the EOL somewhere in the middle) or is it better
to read one byte at a time while avoiding duplicate
reads.

In practice, it will often be equivalent to mark support
but it is conceptually different. Having a memory
buffer somewhere *accessible* (block reads are
efficient) in the stream implementation layers is a
good enough indicator to the rather specific property
that we want to test for.

I'm open to suggestions. It's not particularly elegant
but the best thing I could come up with.

> The "is buffered" API has one caller, stream_readline(), which says
>
> if (stream supports mark&& is buffered&& ...) then
> {
> /* We can efficiently read chunks speculatively and
> reposition the stream pointer to the end of the
> line once we found that. */
> readline_chunky();
> }
> else
> {
> readline_bytewise();
> }
>
> But I don't understand that. I understand why we'd need to use a
> byte-by-byte implementation if the stream doesn't support mark and seek,
> of course, but why would we need to do so if the stream claims not to be
> buffered?
Because it would be slower by at least the excess-
read-to-line-length ratio plus the mark handling
overhead.
> The relevant question right now is: is there a clear benefit for this
> being public in 1.7? I think not, so I propose to make the "is
> buffered" API private by renaming the relevant symbols to
> "svn_stream__*". Or move them into private/svn_io_private.h, as Stefan
> just suggested. Questions of semantics can then wait.
Make it private. It simply felt a little odd to require
every stream to implement a private API as well.

-- Stefan^2.
Received on 2011-06-02 00:08:45 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.