Stefan Sperling wrote:
> On Mon, Aug 16, 2010 at 10:04:31AM +0100, Julian Foad wrote:
>> On Mon, 2010-08-16 at 09:55 +0100, Julian Foad wrote:
>>> On Sat, 2010-08-14, stefan2_at_apache.org wrote:
>>>> Extend the stream API by three functions:
>>>> svn_stream_move_mark() to move an existing mark by some delta
>>> Hi Stefan.
>>> Unfortunately this is not a logical extension to the stream API.
>>> Subversion's svn_stream_t streams support arbitrary "filtering" or
>>> "translation" tasks that require a mostly sequential processing of the
>>> stream. One of the ideas embodied by the "get a mark" and "seek to a
>>> mark" paradigm is that the stream position is not expressible by a
>>> simple number (such as a byte offset from the start), because the stream
>>> may be performing translation whose state at a given position depends on
>>> the preceding content of the stream. Therefore "get a mark" does not
>>> return a scalar number but instead returns a stream state object which
>>> contains the translation state of the stream as well as the "mark" of
>>> the underlying stream if there is one.
>>> Seeking to an arbitrary new position requires telling the stream all of
>>> its relevant internal state at that position. "Move by +/- N bytes"
>>> does not seem to be an operation that can be supported.
I overgeneralized my use-case here: I actually needed "move forward" only.
The concept of "skip N bytes", however, is perfectly in line with the
stream semantics. Because this also fits my needs, I changed the API in
>> ... supported by all streams that support mark & seek. Of course some
>> streams could support it.
>> I don't like to see an API be complicated by lots of functions that may
>> or may not be supported, and functions to find out whether they are
>> supported. I would ask you to look for another way to realize most of
>> the speed gain that you were hoping to get from this extension.
The impact of this change is pretty large as all FSFS structures except for
the actual delta info will be processed by parsers using stream_readline().
Reading that data byte-by-byte from the APR file is by far the most
part of that parsing process. So, I would like to see that API extension
> If it provides a real performance advantage, maybe we should just switch
> some APIs which use streams to APR file handles where it makes sense.
> Files can be seeked via byte-offsets.
> The stream abstraction is nice, but it can get in the way.
> Even mark/seek support as currently implemented is already beyond the
> original idea of the stream abstraction. But we needed it for the patch code
> which is using streams to provide an interface for reading text from patch
> files in various ways. It's a nice abstraction for that purpose and should
> remain, but maybe Stefan Fuhrmann has good reasons for not using the
> stream abstraction elsewhere.
The problem is that a lot of parser code would need to change at least its
function signatures to string buffers and offsets instead of streams.
even sure that all streams are based on strings; some may be parsed on
APR file as well and use the same parser code (e.g. read hash from stream).
IOW, the least risky alternative approach would probably be the introduction
of a "kind-of-stream" concept that supports the extra functionality.
Received on 2010-08-17 23:58:09 CEST