Hi Greg,
On Thu, Jun 16, 2011 at 8:00 AM, <gstein_at_apache.org> wrote:
> Author: gstein
> Date: Thu Jun 16 06:00:55 2011
> New Revision: 1136294
>
> URL: http://svn.apache.org/viewvc?rev=1136294&view=rev
> Log:
> Sketch out some structures for pausing the XML parsing of the "update"
> report. This will allow us to apply a limit to the outstanding number of
> requests queued into the serf context.
>
> No functional changes. This revision simply adds some structures that will
> be needed to perform the pausing.
>
> (here for review; more code to follow...)
Interesting patch.
>
> * subversion/libsvn_ra_serf/ra_serf.h:
> (svn_ra_serf__xml_parser_t): add PAUSE_ALLOC, PAUSED, and PENDING
> members to manage the parse-pause process.
>
> * subversion/libsvn_ra_serf/util.c:
> (PARSE_CHUNK_SIZE, pending_buffer_t, svn_ra_serf__pendint_t,
pendinG_t.
> SPILL_SIZE): new defines and structures.
>
> Modified:
> subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h
> subversion/trunk/subversion/libsvn_ra_serf/util.c
>
> Modified: subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h
> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h?rev=1136294&r1=1136293&r2=1136294&view=diff
> ==============================================================================
> --- subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h (original)
> +++ subversion/trunk/subversion/libsvn_ra_serf/ra_serf.h Thu Jun 16 06:00:55 2011
> @@ -590,6 +590,31 @@ struct svn_ra_serf__xml_parser_t {
>
> /* If an error occurred, this value will be non-NULL. */
> svn_error_t *error;
> +
> + /* If "pausing" is to be allowed, then this must refer to the connection's
> + bucket allocator. It will be used to hold content in memory, then
> + return it once it has been consumed.
> +
> + NOTE: if this field is NULL, then pausing is NOT allowed. */
> + serf_bucket_alloc_t *pause_alloc;
> +
> + /* If pausing is allowed, then callbacks can set this value to pause
> + the XML parsing. Note that additional elements may be parsed once
> + this value is set (as the current buffer is consumed; no further
> + buffers will be parsed until PAUSED is cleared).
> +
> + At some point, the callbacks should clear this value. The main
> + serf_context_run() loop will notice the change and resume delivery
> + of content to the XML parser. */
> + svn_boolean_t paused;
If delivery is resumed from say disk, it might be 'a while' before all
cached data is consumed and svn is reading from the socket again. To
avoid dropped connections, we might design it so that even while the
XML parser is not paused, svn still continues to read from the socket
and adds the data to the memory/disk cache.
> +
> + /* While the XML parser is paused, content arriving from the server
> + must be saved locally. We cannot stop reading, or the server may
> + decide to drop the connection. The content will be stored in memory
> + up to a certain limit, and will then be spilled over to disk.
> +
Yes, this is the idea. I have been thinking of putting this
functionality in a bucket, e.g. a cache_bucket. Besides the callback
it fits the bucket model quite well.
The benefits of such a bucket are improved locality of the code, but
also that you can put this cache_bucket in the chain before the
deflate_bucket, so that serf can stored the incoming data compressed
into memory, possibly not even requiring disk access.
> + See libsvn_ra_serf/util.c */
> + struct svn_ra_serf__pending_t *pending;
> };
>
> /*
>
> Modified: subversion/trunk/subversion/libsvn_ra_serf/util.c
> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_ra_serf/util.c?rev=1136294&r1=1136293&r2=1136294&view=diff
> ==============================================================================
> --- subversion/trunk/subversion/libsvn_ra_serf/util.c (original)
> +++ subversion/trunk/subversion/libsvn_ra_serf/util.c Thu Jun 16 06:00:55 2011
> @@ -52,6 +52,45 @@
> #define XML_STATUS_ERROR 0
> #endif
>
> +
> +#define PARSE_CHUNK_SIZE 8000
> +
> +/* As chunks of content arrive from the server, and we need to hold them
> + in memory (because the XML parser is paused), they are copied into
> + these buffers. The buffers are arranged into a linked list. */
> +struct pending_buffer_t {
> + apr_size_t size;
> + char data[PARSE_CHUNK_SIZE];
> + struct pending_memnode_t *next;
> +};
Why not store this in an aggregate bucket? It has to be wrapped in a
bucket anyway to return it to the caller. You can still calculate the
size of the current buffer outside the aggregate bucket.
> +
> +
> +/* This structure records pending data for the parser in memory blocks,
> + and possibly into a temporary file if "too much" content arrives. */
> +struct svn_ra_serf__pending_t {
> + /* The amount of content in memory. */
> + apr_size_t memory_size;
> +
> + /* HEAD points to the first block of the linked list of buffers.
> + TAIL points to the last block, for quickly appending more blocks
> + to the overall list. */
> + struct pending_buffer_t *head;
> + struct pending_buffer_t *tail;
> +
> + /* Once MEMORY_SIZE exceeds SPILL_SIZE, then arriving content will be
> + appended to the (temporary) file indicated by SPILL. */
> + apr_file_t *spill;
> +
> + /* As we consume content from SPILL, this value indicates where we
> + will begin reading. */
> + apr_off_t spill_start;
> +};
> +
> +/* We will store one megabyte in memory, before switching to store content
> + into a temporary file. */
> +#define SPILL_SIZE 1000000
> +
> +
>
> static const apr_uint32_t serf_failure_map[][2] =
> {
>
regards,
Lieven
Received on 2011-06-16 08:39:06 CEST