Re: svn commit: r1136294 - in /subversion/trunk/subversion/libsvn_ra_serf: ra_serf.h util.c

From: Greg Stein <gstein_at_gmail.com>
Date: Thu, 16 Jun 2011 16:19:36 -0400

And hit the dev@ list this time...

On Thu, Jun 16, 2011 at 16:19, Greg Stein <gstein_at_gmail.com> wrote:
> On Thu, Jun 16, 2011 at 02:38, Lieven Govaerts <svnlgo_at_mobsol.be> wrote:
>>...
>>> (here for review; more code to follow...)
>>
>> Interesting patch.
>
> Was hoping you'd have time to look at it. Thanks!
>
>>...
>>> + /* If "pausing" is to be allowed, then this must refer to the connection's
>>> + bucket allocator. It will be used to hold content in memory, then
>>> + return it once it has been consumed.
>>> +
>>> + NOTE: if this field is NULL, then pausing is NOT allowed. */
>>> + serf_bucket_alloc_t *pause_alloc;
>>> +
>>> + /* If pausing is allowed, then callbacks can set this value to pause
>>> + the XML parsing. Note that additional elements may be parsed once
>>> + this value is set (as the current buffer is consumed; no further
>>> + buffers will be parsed until PAUSED is cleared).
>>> +
>>> + At some point, the callbacks should clear this value. The main
>>> + serf_context_run() loop will notice the change and resume delivery
>>> + of content to the XML parser. */
>>> + svn_boolean_t paused;
>>
>> If delivery is resumed from say disk, it might be 'a while' before all
>> cached data is consumed and svn is reading from the socket again. To
>> avoid dropped connections, we might design it so that even while the
>> XML parser is not paused, svn still continues to read from the socket
>> and adds the data to the memory/disk cache.
>
> Ah. Good point.
>
> I suspect that we'll generally be okay because the parser will paused
> once N requests get queued. And reaching N could be below the server
> timeout.
>
> I was certainly planning to consume from memory and disk, and append
> if more stuff arrived.
>
> Boy, it would be nice to have multiple threads :-)
>
>>> + /* While the XML parser is paused, content arriving from the server
>>> + must be saved locally. We cannot stop reading, or the server may
>>> + decide to drop the connection. The content will be stored in memory
>>> + up to a certain limit, and will then be spilled over to disk.
>>> +
>> Yes, this is the idea. I have been thinking of putting this
>> functionality in a bucket, e.g. a cache_bucket. Besides the callback
>> it fits the bucket model quite well.
>>
>> The benefits of such a bucket are improved locality of the code, but
>> also that you can put this cache_bucket in the chain before the
>> deflate_bucket, so that serf can stored the incoming data compressed
>> into memory, possibly not even requiring disk access.
>
> I saw this from your patch, but didn't think that it made sense.
>
> We need to read from the network and put that "somewhere" while the
> parser is paused. But what that means is you have a read/write bucket,
> which I disfavor. To draw a picture:
>
> read <- C <- N
>
> where C is your Cache bucket, and N is the Network bucket. But when
> the parser is paused, you have:
>
> C <- write <- read <- N
>
> That just doesn't look like a typical stack of buckets. I think the
> ideal bucket scenario is "build them once, then read-only." A bucket
> can certainly generate "more" data internally when a read request
> occurs. That is just dynamically fulfilling its logical content. But
> altering the content represented over time feels icky.
>
> Thus, I set up the pending_t structure. That structure would be quite
> useful in other areas of our codebase, but (until needed) I made it
> local to ra_serf for now.
>
>>...
>>> +/* As chunks of content arrive from the server, and we need to hold them
>>> + in memory (because the XML parser is paused), they are copied into
>>> + these buffers. The buffers are arranged into a linked list. */
>>> +struct pending_buffer_t {
>>> + apr_size_t size;
>>> + char data[PARSE_CHUNK_SIZE];
>>> + struct pending_memnode_t *next;
>>> +};
>>
>> Why not store this in an aggregate bucket? It has to be wrapped in a
>> bucket anyway to return it to the caller. You can still calculate the
>> size of the current buffer outside the aggregate bucket.
>
> I don't need the content in bucket form. My plan is to "read" from
> pending_t and call inject_to_parser(). This will likely be performed
> in the code block that calls serf_context_run(). The response handler
> for the network will append to the pending structures while the parser
> is paused or when pending data exists. When the parser is active, the
> response handler will inject its content (as the code is structured
> today). IOW, the response handler will not read from the pending data.
>
> Does the above sound reasonable?
>
> Cheers,
> -g
>
Received on 2011-06-16 22:20:09 CEST

This message: [ Message body ]
Next message: Johan Corveleyn: "Re: diff wish"
Previous message: Paul Burba: "Re: svn commit: r1132561 - /subversion/trunk/subversion/tests/cmdline/copy_tests.py"
Maybe in reply to: Lieven Govaerts: "Re: svn commit: r1136294 - in /subversion/trunk/subversion/libsvn_ra_serf: ra_serf.h util.c"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]