[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r17559 - in trunk/subversion: libsvn_subr

From: Peter N. Lundblad <peter_at_famlundblad.se>
Date: 2005-11-30 14:31:19 CET

On Tue, 29 Nov 2005 dionisos@tigris.org wrote:

> Author: dionisos
> Date: Tue Nov 29 17:00:42 2005
> New Revision: 17559
>
> Modified:
> trunk/subversion/include/svn_subst.h
> trunk/subversion/libsvn_subr/subst.c
>
> Log:
> Introduce a stream type which does EOL translation
> and keyword expansion on the fly.
>
> * include/svn_subst.h
> * libsvn_subr/subst.c
> (svn_subst_stream_translated): New. Returns a read/write stream
> doing keyword and eol translation, proxying another stream.
>
> * libsvn_subr/subst.c
> (translate_chunk): New. Abstracted from svn_subst_translate_stream3,
> adapted to use the 'const char' buffer argument required for write
> stream translation and store translation states in a baton.
> (translation_baton): New. Structure for use with translate_chunk
> to store intermediate state.
> (create_translation_baton): New local function to create and initialize
> a translation_baton.
> (translated_stream_baton): New. Baton to hold values associated with
> a translating stream.
> (translated_stream_read, translated_stream_write,
> translated_stream_close): New. Stream callbacks for translated streams.
> (svn_subst_translate_stream3): Reimplement using translate_chunk().
>

You didn't mention the change to svn_subst_copy_and_translate3 here.

>
> Modified: trunk/subversion/include/svn_subst.h
> Url: http://svn.collab.net/viewcvs/svn/trunk/subversion/include/svn_subst.h?rev=17559&p1=trunk/subversion/include/svn_subst.h&p2=trunk/subversion/include/svn_subst.h&r1=17558&r2=17559
> ==============================================================================
> --- trunk/subversion/include/svn_subst.h (original)
> +++ trunk/subversion/include/svn_subst.h Tue Nov 29 17:00:42 2005
> @@ -227,6 +227,24 @@
> svn_boolean_t expand,
> apr_pool_t *pool);
>
> +/** Return a stream which performs eol translation and keyword

> + * expansion like svn_subst_translate_stream3() except that
> + * it's done on-the-fly when reading or writing to @a stream.
> + *

Maybe I seem to play silly, but this may sound like it will somehow
intercept @a stream, so when you write to @a stream, you get translation.
Maybe modify the docstring to look more like that of
svn_stream_compressed().

> + * Read operations from and write operations to the stream
> + * perform the same operation: if @a expand is @c FALSE, both
> + * contract keywords.
> + *
Say that the stream is allocated in @a pool. Is it OK to mix read and
write operations?

> + * @since New in 1.4

Missing period.

> + */
> +svn_stream_t *
> +svn_subst_stream_translated (svn_stream_t *stream,
> + const char *eol_str,
> + svn_boolean_t repair,
> + apr_hash_t *keywords,
> + svn_boolean_t expand,
> + apr_pool_t *pool);
> +
> /** Similar to svn_subst_translate_stream3() except relies upon a
> * @c svn_subst_keywords_t struct instead of a hash for the keywords.
> *
>
> Modified: trunk/subversion/libsvn_subr/subst.c
> Url: http://svn.collab.net/viewcvs/svn/trunk/subversion/libsvn_subr/subst.c?rev=17559&p1=trunk/subversion/libsvn_subr/subst.c&p2=trunk/subversion/libsvn_subr/subst.c&r1=17558&r2=17559
> ==============================================================================
> --- trunk/subversion/libsvn_subr/subst.c (original)
> +++ trunk/subversion/libsvn_subr/subst.c Tue Nov 29 17:00:42 2005
> @@ -823,42 +823,88 @@
> return svn_subst_translate_stream3 (s, d, eol_str, repair, kh, expand, pool);
> }
>
> -svn_error_t *
> -svn_subst_translate_stream3 (svn_stream_t *s, /* src stream */
> - svn_stream_t *d, /* dst stream */
> - const char *eol_str,
> - svn_boolean_t repair,
> - apr_hash_t *keywords,
> - svn_boolean_t expand,
> - apr_pool_t *pool)

> +struct translation_baton

This wants a docstring.

> {
> - char *buf;
> - const char *p, *interesting;
> - apr_size_t len, readlen;
> - apr_size_t eol_str_len = eol_str ? strlen (eol_str) : 0;
> - char newline_buf[2] = { 0 };
> - apr_size_t newline_off = 0;
> - char keyword_buf[SVN_KEYWORD_MAX_LEN] = { 0 };
> - apr_size_t keyword_off = 0;
> - char src_format[2] = { 0 };
> - apr_size_t src_format_len = 0;
> -
> - buf = apr_palloc (pool, SVN_STREAM_CHUNK_SIZE + 1);
> + const char *eol_str;
> + svn_boolean_t repair;
> + apr_hash_t *keywords;
> + svn_boolean_t expand;
> + const char *interesting;
> + apr_size_t eol_str_len;
> + char newline_buf[2];
> + apr_size_t newline_off;
> + char keyword_buf[SVN_KEYWORD_MAX_LEN];
> + apr_size_t keyword_off;
> + char src_format[2];
> + apr_size_t src_format_len;

Here is some strange formatting; some of these have the field names
aligned. Also, a short docstring for each field would be useful.

> +} translation_baton;
> +
> +
> +/* Allocate a baton for use with translate_chunk() in POOL and
> + * initialize it for the first iteration.
> + *
> + * The caller must assure that EOL_STR and KEYWORDS are either
> + * allocated in POOL, or at least have the same guaranteed life time
> + * as the baton created.
> + *

Couldn't the last paragraph be simplified something like:
...have a liftetime at least as long as that of POOL.
> + */
> +
> +static struct translation_baton *
> +create_translation_baton (const char *eol_str,
> + svn_boolean_t repair,
> + apr_hash_t *keywords,
> + svn_boolean_t expand,
> + apr_pool_t *pool)
> +{
> + struct translation_baton *b = apr_palloc (pool, sizeof (*b));
> + int i;
>
> /* For efficiency, convert an empty set of keywords to NULL. */
> if (keywords && (apr_hash_count (keywords) == 0))
> keywords = NULL;
>
> - /* The docstring requires that *some* translation be requested. */
> - assert (eol_str || keywords);
> - interesting = (eol_str && keywords) ? "$\r\n" : eol_str ? "\r\n" : "$";
> + b->eol_str = eol_str;
> + b->eol_str_len = eol_str ? strlen (eol_str) : 0;
> + b->repair = repair;
> + b->keywords = keywords;
> + b->expand = expand;
> + b->interesting = (eol_str && keywords) ? "$\r\n" : eol_str ? "\r\n" : "$";
> + b->newline_buf[0] = '\0';
> + b->newline_buf[1] = '\0';

Does something depend on the characters to be zero?

> + b->newline_off = 0;
> + for (i = 0; i < SVN_KEYWORD_MAX_LEN; i++)
> + b->keyword_buf[i] = '\0';

Why not use memset instead of this explicit loop? And same question as
above.

> + b->keyword_off = 0;
> + b->src_format[0] = '\0';
> + b->src_format[1] = '\0';

And here.

> +/* Translate eols and keywords of a 'chunk' of characters BUF of size BUFLEN
> + * according to the settings and state stored in baton B.
> + *
> + * Write output to stream D.
> + *

One-letter variable names are sometimes convenient, but make the code
harder to read. We use b in many places for baton, but could you please
change d to dst or something?

> + * To finish a series of chunk translations, flush all buffers by calling
> + * this routine with a NULL value for BUF.

"Use POOL for temporary allocations".

> - /* We're in the boring state; look for interest characters.
> - * For lack of a memcspn(), manually skip past NULs. */
> + /* We're in the boring state; look for interest characters. */
> len = 0;
> - while (1)
> - {
> - len += strcspn (p + len, interesting);
> - if (p[len] != '\0' || p + len == buf + readlen)
> - break;
> - len++;
> - }
> +
> + /* Fake strcspn() with a length parameter, strncspn if you like,
> + because we're not sure the buffer ends with a NUL character.
> +

I understand that you did this change because before we had a sentinel
value at the end of the buffer... I think the new comment is misleading,
because what we actually wanted was some memcspn, because a) we don't have
a terminating NUL and b) we want to allow NUL characters in the data.

> + Also, reject NUL characters explicitly, since index()
> + considers them part of the string argument, but we don't...

What do you mean by "reject"? I'd say "skip" for the reason stated above.

> + */
> + while ((p + len) < end
> + && (! p[len] || ! index (interesting, p[len])))

Looking in my manpages on Linux, it says that index is a BSD extension,
while strchr is part of ISO C. Why not use that instead?

> +struct translated_stream_baton

Docstring.

> +{
> + svn_stream_t *stream;
> + struct translation_baton *in_baton, *out_baton;
> + svn_boolean_t written;
> + svn_stringbuf_t *readbuf;
> + char *buf;
> + apr_size_t readbuf_off;
> + apr_pool_t *pool;
> +} translated_stream_baton;
> +
> +
> +static svn_error_t *
> +translated_stream_read (void *baton,
> + char *buffer,
> + apr_size_t *len)
> +{
> + struct translated_stream_baton *b = baton;
> + apr_size_t readlen = SVN_STREAM_CHUNK_SIZE;

I think we should get rid of SVN_STREAM_CHUNK_SIZE because it is a very
big buffer, but that shouldn't be part of this commit, I guess.

> + apr_size_t unsatisfied = *len;
> + apr_size_t off = 0;
> + apr_pool_t *iterpool;
> + svn_pool_clear (b->pool);
> +
> + iterpool = svn_pool_create (b->pool);
> + while (readlen == SVN_STREAM_CHUNK_SIZE && unsatisfied > 0)
> + {
> + apr_size_t to_copy;
> +
> + /* fill read buffer, if necessary */
> + if (! (b->readbuf_off < b->readbuf->len))
> + {
> + svn_stream_t *buf_stream;
> +
> + svn_stringbuf_setempty (b->readbuf);
> + SVN_ERR (svn_stream_read (b->stream, b->buf, &readlen));
> + buf_stream = svn_stream_from_stringbuf (b->readbuf, iterpool);
> +
> + SVN_ERR (translate_chunk (buf_stream, b->in_baton, b->buf,
> + readlen, iterpool));
> +
> + if (readlen != SVN_STREAM_CHUNK_SIZE)
> + SVN_ERR (translate_chunk (buf_stream, b->in_baton, NULL, 0,
> + iterpool));
> +
> + SVN_ERR (svn_stream_close (buf_stream));
> + }
> +
> + /* Satisfy from the read buffer */
> + to_copy = (b->readbuf->len > unsatisfied)
> + ? unsatisfied : b->readbuf->len;
> + memcpy (buffer + off, b->readbuf->data + b->readbuf_off, to_copy);
> + off += to_copy;
> + b->readbuf_off += to_copy;
> + unsatisfied -= to_copy;
> + }

iterpool is never cleared.

> +
> + *len -= unsatisfied;
>
...or destroyed for that matter:-)

> return SVN_NO_ERROR;
> }
>
...
> +
> +svn_stream_t *
> +svn_subst_stream_translated (svn_stream_t *stream,
> + const char *eol_str,
> + svn_boolean_t repair,
> + apr_hash_t *keywords,
> + svn_boolean_t expand,
> + apr_pool_t *pool)
...

> + /* Setup the baton fields */
> + baton->stream = stream;
> + baton->in_baton
> + = create_translation_baton (eol_str, repair, keywords, expand, pool);
> + baton->out_baton
> + = create_translation_baton (eol_str, repair, keywords, expand, pool);
> + baton->written = FALSE;
> + baton->readbuf = svn_stringbuf_create ("", pool);

> + baton->readbuf_off = 0;
> + baton->buf = apr_palloc (pool, SVN_STREAM_CHUNK_SIZE + 1);
> + baton->pool = svn_pool_create (pool);
> +
The above may leave 256K of buffers allocated in pool after using this
stream. Normally it is the callers responsibility to handle pool memory,
but this might be too much. My preferred solution is to use a smaller
buffer size (say 8K or something), for several reasons. If you don't want
to do that now, maybe you should use a subpool here? Maybe use
baton->pool; it doesn't seem to be used except for allocating the
iterpool.

> +
> +svn_error_t *
> +svn_subst_translate_stream3 (svn_stream_t *s, /* src stream */
> + svn_stream_t *d, /* dst stream */
> + const char *eol_str,
> + svn_boolean_t repair,
> + apr_hash_t *keywords,
> + svn_boolean_t expand,
> + apr_pool_t *pool)
> +{
> + apr_pool_t *subpool = svn_pool_create (pool);
> + apr_pool_t *iterpool = svn_pool_create (subpool);
> + struct translation_baton *baton;
> + apr_size_t readlen = SVN_STREAM_CHUNK_SIZE;
> + char *buf = apr_palloc (subpool, SVN_STREAM_CHUNK_SIZE);

Here, someone thought using the main poool for the big buffer wasn't
appropriate:-)

Other than that, nice work, Erik!

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Wed Nov 30 14:47:39 2005

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.