On Wed, Sep 9, 2015 at 12:31 PM, Evgeny Kotkov <evgeny.kotkov_at_visualsvn.com>
> Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> writes:
> >> The original commit begins using svn_stream_wrap_buffered_read() during
> >> svnadmin load-revprops and svnfsfs load-index. This patch, however,
> >> something entirely different and adds buffering to *every* stdin.
> > Sorry for the confusion. I did not intend to actually change the
> > implementation of svn_stream_for_stdin this way but simply tried to
> > demonstrate a problem with APR buffer reads for "streamy" file handles.
> Thank you for the explanation.
After some debugging I found the reason for the ra-test hang. Once I knew
it's been kind of obvious from the code: For buffered files,
more or less equivalent to apr_file_write_full().
Changing that in APR might break APR internals (buffer may be half-filled
but not be at EOF) as well as user code (read_full required where a normal
read used to be sufficient).
> > The underlying problem is still present: If stdin can't deliver data fast
> > enough (e.g. some hick-up / long latency on the producer side of a dump |
> > load), a buffered APR file will error out while a non-buffered one will
> > simply wait & retry.
> > However, I have yet to try and provoke the error specifically for the
> > dump | load scenario.
> I think that this problem is nonexistent.
> A program doesn't need to handle EAGAIN during read() unless it puts the
> into the non-blocking mode with O_NONBLOCK . We don't do that for
> and, irrespectively of what happens with the data on the other side of the
> pipe, read() is going to block until the requested data is available.
>  http://pubs.opengroup.org/onlinepubs/9699919799/functions/read.html
You are right. I tried various methods to delay the producer side and
it never caused the consumer to error out. The ra-test hang mislead
me into believing that the EAGAIN handling was the problem.
So, you are absolutely right about APR buffering being enough for our
non-interactive "read stdin as a single large file" usage in load & friends.
My last concern is as follows. We need an alternative implementation
for svn_stream_for_stdin. I see 3 options:
1. svn_stream_for_stdin2 with buffered option.
2. Always enable buffering in svn_stream_for_stdin.
3. Some private API.
The problem with 1. is that if we use APR_BUFFERED for it, the new
API will be add leakage to the abstraction: the user has to know when
it is safe to enable buffering. This problem does not occur if the wrapper
stream is being used instead.
Thus, variant 2 is impossible with APR_BUFFERED but quite possible
using the buffering wrapper. The only way it could cause a problem is
if some code was to rely on the actual size of a read request from stdin.
Since the latter is managed by the OS + C runtime, it is hard to think of
a way to make this happen - but still. Having a fall-back would be nice,
i.e. variant 1 is safer than 2.
Option 3 is basically saying "the public API is not good enough but we
don't want to give you a better one". It is, however, the only variant
where I would be fully comfortable just using APR buffering.
Received on 2015-09-10 15:28:59 CEST