Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 2)

From: Daniel Trebbien <dtrebbien_at_gmail.com>
Date: Tue, 21 Sep 2010 08:12:38 -0700

On Sun, Sep 19, 2010 at 9:39 PM, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
>> * subversion/include/svn_subst.h
>> Added documentation of the new public API function
>> svn_subst_translate_string2().
>>
>
> Should use this syntax:
>
> * subversion/include/svn_subst.h
> (svn_subst_translate_string2): New function.

Okay.

>> * subversion/libsvn_subr/subst.c
>> (translation_baton): Added TRANSLATED_EOL field. It's where translate_chunk()
>> records that it translated a line ending.
>> (create_translation_baton): Added a new parameter TRANSLATED_EOL. Initialize
>> the TRANSLATED_EOL field of the resulting translation_baton to this value.
>> (translate_chunk): After translating a line ending, set the value that is
>> pointed to by TRANSLATED_EOL (from the translation baton) to TRUE.
>
> Could just say "Record that an EOL translation has been done" or similar.
>
> In general, the log message looks good, but to my taste it is borders
> the 'too verbose a log message' line. (I might have written the same
> without the last sentence of each bullet.) Remember that the log
> message must never /repeat/ the code (or in-code comments); rather, it
> should describe them from a high level.

Okay. I will attempt to make the log message less verbose.

>> Index: subversion/include/svn_subst.h
>> ===================================================================
>> --- subversion/include/svn_subst.h (revision 996730)
>> +++ subversion/include/svn_subst.h (working copy)
>> @@ -598,6 +598,26 @@ svn_error_t *svn_subst_translate_string(svn_string
>> const char *encoding,
>> apr_pool_t *pool);
>>
>> +/** Translate the data in @a value (assumed to be in encoded in charset
>> + * @a encoding) to UTF8 and LF line-endings. If @a encoding is @c NULL,
>> + * then assume that @a value is in the system-default language encoding.
>> + * If @p translated_to_utf8 is not @c NULL, then
>> + * @code *translated_to_utf8 @endcode is set to @c TRUE if at least one
>
> You use @a, @p, and @code. Existing code uses only @a. Why the
> difference?

Oops. @p should be @a.

I believe that in Doxygen, @c WORD is short for @code WORD @endode.

>> + * character of @p value in the source charset was translated to UTF-8;
>
> Need the word "to" after the semicolon for the English to be correct.

Fixed. (When I say "fixed", I mean that I committed a fix in my own
fork of Subversion: http://github.com/dtrebbien/subversion )

>> + * @c FALSE otherwise. If @p translated_line_endings is not @c NULL, then
>> + * @code *translated_line_endings @endcode is set to @c TRUE if at least one
>> + * line ending was changed to LF; @c FALSE otherwise. Return the translated data in @a *new_value,
>> + * allocated in @a pool.
>> + *
>
> Need to rewrap.

Fixed.

>> + * @since New in 1.6
>
> s/1.6/1.7/

Fixed.

>> + */
>> +svn_error_t *svn_subst_translate_string2(svn_string_t **new_value,
>
> I realize that you followed the precedent of svn_subst_translate_string(),
> but please:
>
> Return type on own line, function name on the next line.

Fixed.

>> + svn_boolean_t *translated_to_utf8,
>> + svn_boolean_t *translated_line_endings,
>> + const svn_string_t *value,
>> + const char *encoding,
>> + apr_pool_t *pool);
>> +
>> /** Translate the data in @a value from UTF8 and LF line-endings into
>
> Need to update svn_subst_translate_string()'s docstring and
> implementation and mark it deprecated:
> http://subversion.apache.org/docs/community-guide/community-guide.html#deprecation
> http://mid.gmane.org/20100814132023.089A223889B3@eris.apache.org

Okay.

>> * native locale and native line-endings, or to the output locale if
>> * @a for_output is TRUE. Return the translated data in @a
>
>> Index: subversion/libsvn_subr/subst.c
>> ===================================================================
>> --- subversion/libsvn_subr/subst.c (revision 996730)
>> +++ subversion/libsvn_subr/subst.c (working copy)
>> @@ -765,6 +765,7 @@ svn_subst_keywords_differ2(apr_hash_t *a,
>> struct translation_baton
>> {
>> const char *eol_str;
>> + svn_boolean_t *translated_eol;
>> svn_boolean_t repair;
>> apr_hash_t *keywords;
>> svn_boolean_t expand;
>> @@ -805,6 +806,7 @@ struct translation_baton
>> */
>> static struct translation_baton *
>> create_translation_baton(const char *eol_str,
>> + svn_boolean_t *translated_eol,
>> svn_boolean_t repair,
>> apr_hash_t *keywords,
>> svn_boolean_t expand,
>> @@ -818,6 +820,7 @@ create_translation_baton(const char *eol_str,
>>
>> b->eol_str = eol_str;
>> b->eol_str_len = eol_str ? strlen(eol_str) : 0;
>> + b->translated_eol = translated_eol;
>> b->repair = repair;
>> b->keywords = keywords;
>> b->expand = expand;
>> @@ -884,6 +887,8 @@ translate_chunk(svn_stream_t *dst,
>> b->src_format,
>> &b->src_format_len, b->newline_buf,
>> b->newline_off, dst, b->repair));
>> + if (b->translated_eol)
>> + *b->translated_eol = TRUE;
>>
>
> I don't have the patience to dive into this function right now, but the
> patch is large enough that I'll review the rest and come back to this
> some other time.
>
> <appendum>
> There are several calls to translate_newline() in this function, but you
> added the '*translated_eol = TRUE' line only after one of them. That
> triggers a red light for me...
> </appendum>
>
> (PS: "appendum"? anyone got a better suggestion?)
> (PPS: "addendum".)

Oh, I didn't notice. I will have to look into this issue.

>> /* Jam the text into the destination stream (to translate it). */
>> SVN_ERR(svn_stream_write(dst_stream, src, &len));
>> @@ -1362,6 +1386,19 @@ svn_subst_read_specialfile(svn_stream_t **stream,
>> return SVN_NO_ERROR;
>> }
>>
>> +svn_error_t *
>> +svn_subst_translate_cstring2(const char *src,
>> + const char **dst,
>> + const char *eol_str,
>> + svn_boolean_t repair,
>> + apr_hash_t *keywords,
>> + svn_boolean_t expand,
>> + apr_pool_t *pool)
>> +{
>> + return translate_cstring2(dst, NULL, src, eol_str, repair, keywords, expand,
>> + pool);
>> +}
>
> So, this is revved because svn_subst_translate_string2() needs it (and
> svn_subst_translate_string() doesn't).

I am unsure of what you mean.

I took the original implementation of `svn_subst_translate_cstring2`
and placed it in the new static utility function `translate_cstring2`.
`translate_cstring2` accepts another parameter, and its definition
(the old definition on `svn_subst_translate_cstring2`) was
appropriately modified to handle this new parameter. This way, I don't
have to modify the public API.

>> +
>> /* Given a special file at SRC, generate a textual representation of
>> it in a normal file at DST. Perform all allocations in POOL. */
>> /* ### this should be folded into svn_subst_copy_and_translate3 */
>> @@ -1714,6 +1751,54 @@ svn_subst_translate_string(svn_string_t **new_valu
>>
>>
>> svn_error_t *
>> +svn_subst_translate_string2(svn_string_t **new_value,
>> + svn_boolean_t *translated_to_utf8,
>> + svn_boolean_t *translated_line_endings,
>> + const svn_string_t *value,
>> + const char *encoding,
>> + apr_pool_t *pool)
>> +{
>
> Why is this adding entirely new lines of code (the diff shows only /^+/
> lines)? Normally we make the new function by editing the old one
> in-place, and then re-create the old one as a wrapper (like
> translate_cstring2()).
>
> The fact you didn't do so makes me suspect you just copied the code of
> svn_subst_translate_string(), which is bad for several reasons:
>
> * It's code duplication. If we ever make a bugfix, we'll have to
> remember to make it to both versions of the function.
>
> * It makes it impossible for me to see /from the unidiff/ what's the
> code difference between the current and being-added versions of the
> function.

Yes, I did copy-and-paste. I see your point, though, and I am going to
correct this by adding another `static` function that both will call.

> Out of patience for one sitting, so not reviewing these now. (I feel they're
> orthogonal/independent of the parts I'd already reviewed.)
>
> I suggest you go ahead (respond to the review, submit an updated patch,
> etc) without waiting; the svnsync part will eventually get its share of
> reviews. Also, you may want to consider splitting this patch into two
> parts (subst work separate from svnsync work).

I like the idea of splitting this into two patches.
Received on 2010-09-21 17:13:17 CEST

This message: [ Message body ]
Next message: Daniel Trebbien: "Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 2)"
Previous message: Stefan Sperling: "Re: svn commit: r999421 - /subversion/branches/atomic-revprop/subversion/svnsync/main.c"
In reply to: Daniel Shahaf: "Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 2)"
Next in thread: Daniel Trebbien: "Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 2)"
Reply: Daniel Trebbien: "Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 2)"
Reply: Daniel Shahaf: "Re: [PATCH] allow svnsync to translate non-UTF-8 log messages to UTF-8 (v. 2)"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]