Re: [PATCH] issue 1493 - use libsvn_diff for diffing props

From: Stefan Sperling <stsp_at_elego.de>
Date: Mon, 3 Aug 2009 18:23:51 +0100

On Mon, Jul 27, 2009 at 12:48:27PM +0200, Daniel Näslund wrote:
> Attaching my new patch with the proposed changes from stsp.

Hi Daniel,

I've attached an updated patch this time which includes my suggested
changes. Please check my version of the patch to see if you are happy
with it.

Every change I made is documented in the comments below.

> Index: subversion/libsvn_diff/diff_memory.c
> ==================================================================--- subversion/libsvn_diff/diff_memory.c (revision 38477)
> +++ subversion/libsvn_diff/diff_memory.c (arbetskopia)
> @@ -426,7 +426,8 @@
> /* Flush the hunk currently built up in BATON
> into the baton's output_stream */

Should also update the doc string to mention the new parameter.

> static svn_error_t *
> -output_unified_flush_hunk(output_baton_t *baton)
> +output_unified_flush_hunk(output_baton_t *baton,
> + const char *hunk_delimiter)

Wrong indentation here. Please align all parameters at the same column.

> {
> apr_off_t target_token;
> apr_size_t hunk_len;
> @@ -453,23 +454,37 @@
> /* Hunk length 1 is implied, don't show the
> length field if we have a hunk that long */
> (baton->hunk_length[0] == 1)

For functions, we use no space (including newlines) before parentheses.
This seems to be old code which hadn't been updated to this convention.

> - ? ("@@ -%" APR_OFF_T_FMT)
> - : ("@@ -%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT),
> + ? ("%s -%" APR_OFF_T_FMT)
> + : ("%s -%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT),
> + hunk_delimiter,
> baton->hunk_start[0], baton->hunk_length[0]));
>
> if (baton->hunk_length[1] > 0)
> /* Convert our 0-based line numbers into unidiff 1-based numbers */
> baton->hunk_start[1]++;
> - SVN_ERR(svn_stream_printf_from_utf8
> +

The function should fall back to the default hunk delimiter
if none is supplied.

> +
> + /* Hunk length 1 is implied, don't show the
> + length field if we have a hunk that long */
> + if (baton->hunk_length[1] == 1)
> + {
> + SVN_ERR(svn_stream_printf_from_utf8

Again, no space before paren.

> (baton->output_stream, baton->header_encoding,
> baton->pool,
> - /* Hunk length 1 is implied, don't show the
> - length field if we have a hunk that long */
> - (baton->hunk_length[1] == 1)
> - ? (" +%" APR_OFF_T_FMT " @@" APR_EOL_STR)
> - : (" +%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT " @@" APR_EOL_STR),
> - baton->hunk_start[1], baton->hunk_length[1]));
> + " +%" APR_OFF_T_FMT " %s" APR_EOL_STR,
> + baton->hunk_start[1], hunk_delimiter));
> + }
> + else
> + {
> + SVN_ERR(svn_stream_printf_from_utf8

Same.

> + (baton->output_stream, baton->header_encoding,
> + baton->pool,
> + " +%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT " %s" APR_EOL_STR,
> + baton->hunk_start[1], baton->hunk_length[1],
> + hunk_delimiter));
>
> + }
> +
> hunk_len = baton->hunk->len;
> SVN_ERR(svn_stream_write(baton->output_stream,
> baton->hunk->data, &hunk_len));
> @@ -498,7 +513,7 @@
> targ_mod = modified_start;
>
> if (btn->next_token + SVN_DIFF__UNIFIED_CONTEXT_SIZE < targ_orig)
> - SVN_ERR(output_unified_flush_hunk(btn));
> + SVN_ERR(output_unified_flush_hunk(btn, "@@"));

With output_unified_flush_hunk falling back to a reasonable default,
we can now just pass NULL instead of "@@" here. The default is now
set at only one location in the code.

>
> if (btn->hunk_length[0] == 0
> && btn->hunk_length[1] == 0)
> @@ -530,8 +545,10 @@
>
>
> svn_error_t *
> -svn_diff_mem_string_output_unified(svn_stream_t *output_stream,
> +svn_diff_mem_string_output_unified2(svn_stream_t *output_stream,
> svn_diff_t *diff,
> + svn_boolean_t with_diff_header,

Parameters should follow the indentation change.

> + const char *hunk_delimiter,
> const char *original_header,
> const char *modified_header,
> const char *header_encoding,
> @@ -563,23 +580,50 @@
> fill_source_tokens(&baton.sources[0], original, pool);
> fill_source_tokens(&baton.sources[1], modified, pool);
>
> - SVN_ERR(svn_stream_printf_from_utf8
> - (output_stream, header_encoding, pool,
> - "--- %s" APR_EOL_STR
> - "+++ %s" APR_EOL_STR,
> - original_header, modified_header));
> + if (with_diff_header) {

The indentation here does not match our coding style guidelines.

> + SVN_ERR(svn_stream_printf_from_utf8

No space before paren. I've also tweaked the indentation a bit here.

> + (output_stream, header_encoding, pool,
> + "--- %s" APR_EOL_STR
> + "+++ %s" APR_EOL_STR,
> + original_header, modified_header));
> + }
>
> SVN_ERR(svn_diff_output(diff, &baton,
> &mem_output_unified_vtable));
> - SVN_ERR(output_unified_flush_hunk(&baton));
>
> + SVN_ERR(output_unified_flush_hunk(&baton, hunk_delimiter));
> +
> svn_pool_destroy(baton.pool);
> }
>
> return SVN_NO_ERROR;
> }
>
> +svn_error_t *
> +svn_diff_mem_string_output_unified(svn_stream_t *output_stream,
> + svn_diff_t *diff,
> + const char *original_header,
> + const char *modified_header,
> + const char *header_encoding,
> + const svn_string_t *original,
> + const svn_string_t *modified,
> + apr_pool_t *pool)
> +{
> + SVN_ERR(svn_diff_mem_string_output_unified2(
> + output_stream,

Please align all parameters on the same column where the function
name ends, wherever possible, like so:

SVN_ERR(svn_diff_mem_string_output_unified2(output_stream,
diff,
...

> + diff,
> + TRUE,
> + "@@",

Again, now passing NULL.

> + original_header,
> + modified_header,
> + header_encoding,
> + original,
> + modified,
> + pool));
> + return SVN_NO_ERROR;
> +}
>
> +
>
> /* diff3 merge output */
>
> Index: subversion/include/svn_diff.h
> ==================================================================--- subversion/include/svn_diff.h (revision 38477)
> +++ subversion/include/svn_diff.h (arbetskopia)
> @@ -667,15 +667,36 @@
> const svn_diff_file_options_t *options,
> apr_pool_t *pool);
>
> -
> /** Outputs the @a diff object generated by svn_diff_mem_string_diff()
> * in unified diff format on @a output_stream, using @a original
> * and @a modified for the text in the output.
> - * Outputs the header and markers in @a header_encoding.
> + * Only if @a with_diff_header is TRUE it outputs the header. Useful for
> + * property diffs.

I've tweaked this slightly. There's no need to mention one particular
case of application. You wouldn't say "useful for checking if the
string 'abc' matches the string 'abc'" in documentation of strcmp() either.

> *
> + * Outputs the header and markers in @a header_encoding with the specified
> + * @a hunk_delimiters.
> + *
> * @a original_header and @a modified header are
> * used to fill the field after the "---" and "+++" header markers.
> *
> + * @since New in 1.7.
> + */
> +svn_error_t *
> +svn_diff_mem_string_output_unified2(svn_stream_t *output_stream,
> + svn_diff_t *diff,
> + svn_boolean_t with_diff_header,
> + const char *hunk_delimiter,
> + const char *original_header,
> + const char *modified_header,

Parameters should follow the indentation change.

> + const char *header_encoding,
> + const svn_string_t *original,
> + const svn_string_t *modified,
> + apr_pool_t *pool);
> +
> +/** Similar to svn_diff_mem_string_output_unified2() but with @a
> + * with_diff_headers always set to TRUE and @a hunk_delimiter always
> + * set to "@@".

Docstring slightly tweaked, see attached patch.

> + *
> * @since New in 1.5.
> */
> svn_error_t *
> Index: subversion/libsvn_client/diff.c
> ==================================================================--- subversion/libsvn_client/diff.c (revision 38477)
> +++ subversion/libsvn_client/diff.c (arbetskopia)
> @@ -49,6 +49,7 @@
> #include "svn_time.h"
> #include "svn_sorts.h"
> #include "svn_base64.h"
> +#include "svn_subst.h"
> #include "client.h"
>
> #include "private/svn_wc_private.h"
> @@ -167,6 +168,34 @@
> _("Path '%s' must be an immediate child of " \
> "the directory '%s'"), path, relative_to_dir)
>
> +/* A helper func used by display_prop_diffs. It appends the systems default
> + eol character if there is none in the string.
> + token is the string holding the property value. The return value is allocated
> + from pool.*/

I've renamed this function to maybe_append_eol() and rewritten its
docstring. I think the docstring is now more accurate and describes
the behaviour in more detail. Do you think my description of the
function is correct?

> +static const svn_string_t *
> +append_eol(const svn_string_t *token, apr_pool_t *pool)
> +{
> + const char *curp;
> +
> + if (token->len == 0)
> + return token;
> +
> + curp = token->data + token->len - 1;
> + if (*curp == '\r')
> + {

The indentation here does not conform to our coding style guidelines.

> + return token;
> + }
> + else if (*curp != '\n')
> + {
> + const char *tmp = apr_psprintf(pool, "%s%s", token->data, APR_EOL_STR);
> + return svn_string_create(tmp, pool);

We can avoid using a temporary variable by using svn_string_createf().

I would like to avoid manual parsing of EOL strings here.
But we need a good set of APIs for that first.
Edmund Wong is looking into that.

> + }
> + else
> + {
> + return token;
> + }
> +}
> +
> /* A helper func that writes out verbal descriptions of property diffs
> to FILE. Of course, the apr_file_t will probably be the 'outfile'
> passed to svn_client_diff5, which is probably stdout. */
> @@ -242,53 +271,39 @@
> continue;
> }
>
> - /* For now, we have a rather simple heuristic: if this is an
> - "svn:" property, then assume the value is UTF-8 and must
> - therefore be converted before printing. Otherwise, just
> - print whatever's there and hope for the best. */
> {
> - svn_boolean_t val_is_utf8 = svn_prop_is_svn_prop(propchange->name);
> + /* Use libsvn_diff for output.*/

Superfluous comment.

> + svn_stream_t *os = svn_stream_from_aprfile2(file, TRUE, pool);
> + svn_diff_t *diff;
> + svn_diff_file_options_t options;
> + const svn_string_t *tmp;
>
> - if (original_value != NULL)
> - {
> - if (val_is_utf8)
> - {
> - SVN_ERR(file_printf_from_utf8
> - (file, encoding,
> - " - %s" APR_EOL_STR, original_value->data));
> - }
> - else
> - {
> - /* ### todo: check for error? */
> - apr_file_printf
> - (file, " - %s" APR_EOL_STR, original_value->data);
> - }
> - }
> + /* The last character in a property is often not a newline.
> + Since the diff is not useful anyway for patching properties an
> + eol character is appended when needed to remove those pescious
> + ' \ No newline at end of file' lines. */
> + tmp = original_value ? original_value : svn_string_create("", pool);
>
> - if (propchange->value != NULL)
> - {
> - if (val_is_utf8)
> - {
> - SVN_ERR(file_printf_from_utf8
> - (file, encoding, " + %s" APR_EOL_STR,
> - propchange->value->data));
> - }
> - else
> - {
> - /* ### todo: check for error? */
> - apr_file_printf(file, " + %s" APR_EOL_STR,
> - propchange->value->data);
> - }
> - }
> + const svn_string_t *orig = append_eol(tmp, pool);

We can't declare variables in the middle of a scope.
This probably won't compile on Windows.

> +
> + tmp = propchange->value ? propchange->value :

Please use a single space after '?'.

> + svn_string_create("", pool);
> +
> + const svn_string_t *val = append_eol(tmp, pool);

Can't declare variables in the middle of a scope.

> +
> + SVN_ERR(svn_diff_mem_string_diff(&diff, orig, val, &options, pool));
> +
> + /* The diff uses '##' as delimiters for the hunk sections and does not
> + * display a diff header. */
> + SVN_ERR(svn_diff_mem_string_output_unified2(os, diff, TRUE, "##",

This comment does not explain why we are using '##' instead of the default.
I've changed this to explain the reason.

Also, your code does not match the comment, because with_diff_header is
set to TRUE.

> + svn_dirent_local_style(path, pool),
> + svn_dirent_local_style(path, pool),
> + encoding, orig, val, pool));
> + SVN_ERR(svn_stream_close(os));
> +
> }
> }
>
> - /* ### todo [issue #1533]: Use file_printf_from_utf8() to convert this
> - to native encoding, at least conditionally? Or is it better to
> - have under_string always output the same eol, so programs can
> - find it consistently? Also, what about checking for error? */
> - apr_file_printf(file, APR_EOL_STR);
> -
> return SVN_NO_ERROR;
> }
>

I've slightly tested the attached patch now and produces splendid
diffs for multi-line property values. Thanks a lot! :)

In the future, please take more care about indentation. It takes
a lot of time to fix it up, and the less time your patches take
to process, the quicker they will get committed.

Before we commit this, as a final verification step, can you take the
time to run 'make check' on the patch and clean up any fallout in the
test suite? Or have you run the regression tests on this already?

Thanks,
Stefan

text/x-diff attachment: issue-1493-v2.diff

Received on 2009-08-03 19:24:24 CEST

This message: [ Message body ]
Next message: Auke Jilderda: "unsubscribe"
Previous message: Stefan Sperling: "EOL-functions"
Next in thread: Daniel NÃ¤slund: "Changed order when diffing with --depth option"
Reply: Daniel NÃ¤slund: "Changed order when diffing with --depth option"
Reply: Daniel NÃ¤slund: "[PATCH v3] issue 1493 - use libsvndiff for diffing properties"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]