stefan2_at_apache.org wrote on Fri, Sep 07, 2012 at 00:14:28 -0000:
> Author: stefan2
> Date: Fri Sep 7 00:14:28 2012
> New Revision: 1381814
>
> URL: http://svn.apache.org/viewvc?rev=1381814&view=rev
> Log:
> Fix a performance issue created by directory deltification: rep streams
> don't support mark / seek / skip and make the line-based hash parser
> fall back to byte-by-byte read requests. With deltification, those read
> requests become extra expensive.
>
> Thus, read the whole content first (requires less memory than the final
> hash object) and parse them from a stringbuf-based stream. Also, use
> a temporary pool for all non-result objects.
>
So you're using a stringbuf stream instead of an undeltification stream
returned by read_representation() because the hash parser needs seek
support.
Isn't this fix the wrong approach? It sounds like it would be better to
fix either the deltification stream read_representation() represents, or
the consumer code in svn_hash_read2(), so they work nicely together.
(Whether that means removing the seek() requirement from the latter, or
making the former prepare undeltified bytes in bigger chunks in advance,
or ...)
The idea of streams is that they can be pushed into one another freely
--- injecting a stringbuf stream in the middle is more or less exactly
what streams were designed to avoid...!?
Cheers,
Daniel
> * subversion/libsvn_fs_fs/fs_fs.c
> (get_dir_contents): undeltify first, parse later
>
> Modified:
> subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c
>
> Modified: subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c
> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c?rev=1381814&r1=1381813&r2=1381814&view=diff
> ==============================================================================
> --- subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c (original)
> +++ subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c Fri Sep 7 00:14:28 2012
> @@ -5065,10 +5065,25 @@ get_dir_contents(apr_hash_t *entries,
> }
> else if (noderev->data_rep)
> {
> + /* use a temporary pool for temp objects.
> + * Also undeltify content before parsing it. Otherwise, we could only
> + * parse it byte-by-byte.
> + */
> + apr_pool_t *text_pool = svn_pool_create(pool);
> + apr_size_t len = noderev->data_rep->expanded_size;
> + svn_stringbuf_t *text = svn_stringbuf_create_ensure(len, text_pool);
> + text->len = len;
> +
> /* The representation is immutable. Read it normally. */
> - SVN_ERR(read_representation(&contents, fs, noderev->data_rep, pool));
> - SVN_ERR(svn_hash_read2(entries, contents, SVN_HASH_TERMINATOR, pool));
> + SVN_ERR(read_representation(&contents, fs, noderev->data_rep, text_pool));
> + SVN_ERR(svn_stream_read(contents, text->data, &text->len));
> SVN_ERR(svn_stream_close(contents));
> +
> + /* de-serialize hash */
> + contents = svn_stream_from_stringbuf(text, text_pool);
> + SVN_ERR(svn_hash_read2(entries, contents, SVN_HASH_TERMINATOR, pool));
> +
> + svn_pool_destroy(text_pool);
> }
>
> return SVN_NO_ERROR;
>
>
Received on 2012-09-10 20:02:44 CEST