[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] issue 1493 - use libsvn_diff for diffing props

From: Stefan Sperling <stsp_at_elego.de>
Date: Fri, 24 Jul 2009 20:13:34 +0100

On Fri, Jul 24, 2009 at 08:07:12PM +0200, Daniel Nslund wrote:
> Hey!
>
> This is my first attempt as a Subversion coder. Please be nice to me!
>
> It works for patch(1) but I haven't tried it on any other patch programs.
>
> Some background on the subject:
> http://subversion.tigris.org/ds/viewMessage.do?dsMessageId=977146&dsForumId=462
>
> / Daniel Nslund
> [[[
> Fix part of issue 1493: Property diffs/merge should use libsvn_diff.
> For properties I have removed the diff headers and replaced the @@-characters
> with ##.
>
> *subversion/libsvn_diff/diff_memory.c
> (output_unified_flush_hunk2): Added parameter for choosing delimiter, for
> instance '##' instead of '@@'.
>
> *subversion/libsvn_diff/diff_memory.c
> (svn_diff_mem_string_output_unified2): Added parameter for choosing if the diff
> should be formatted as a property diff, '##' as delimiters.
>
> *subversion/include/svn_diff.h
> (output_unified_flush_hunk2): Declaration
> (svn_diff_mem_string_output_unified2): Declaration
>
> *subversion/libsvn_client/diff.c
> (append_eol): Appends an end of line character if missing.
>
> *subversion/libsvn_client/diff.c
> (display_prop_diffs): Use libsvn_diff for diffing properties.
> ]]]
>
> Index: subversion/libsvn_diff/diff_memory.c
> ===================================================================
> --- subversion/libsvn_diff/diff_memory.c (revision 38477)
> +++ subversion/libsvn_diff/diff_memory.c (arbetskopia)
> @@ -426,7 +426,8 @@
> /* Flush the hunk currently built up in BATON
> into the baton's output_stream */
> static svn_error_t *
> -output_unified_flush_hunk(output_baton_t *baton)
> +output_unified_flush_hunk2(output_baton_t *baton,
> + const char *hunk_delimiter)

You don't need to rev static functions.
Only public API (i.e. anything in subversion/include/*h but not
subversion/include/private/ or anywhere else) needs to stay
backwards compatible.

So just change the static function to your liking.

> {
> apr_off_t target_token;
> apr_size_t hunk_len;
> @@ -453,23 +454,37 @@
> /* Hunk length 1 is implied, don't show the
> length field if we have a hunk that long */
> (baton->hunk_length[0] == 1)
> - ? ("@@ -%" APR_OFF_T_FMT)
> - : ("@@ -%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT),
> + ? ("%s -%" APR_OFF_T_FMT)
> + : ("%s -%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT),
> + hunk_delimiter,
> baton->hunk_start[0], baton->hunk_length[0]));
>
> if (baton->hunk_length[1] > 0)
> /* Convert our 0-based line numbers into unidiff 1-based numbers */
> baton->hunk_start[1]++;
> - SVN_ERR(svn_stream_printf_from_utf8
> +
> +
> + /* Hunk length 1 is implied, don't show the
> + length field if we have a hunk that long */
> + if (baton->hunk_length[1] == 1)
> + {
> + SVN_ERR(svn_stream_printf_from_utf8
> (baton->output_stream, baton->header_encoding,
> baton->pool,
> - /* Hunk length 1 is implied, don't show the
> - length field if we have a hunk that long */
> - (baton->hunk_length[1] == 1)
> - ? (" +%" APR_OFF_T_FMT " @@" APR_EOL_STR)
> - : (" +%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT " @@" APR_EOL_STR),
> - baton->hunk_start[1], baton->hunk_length[1]));
> + " +%" APR_OFF_T_FMT " %s" APR_EOL_STR,
> + baton->hunk_start[1], hunk_delimiter));
> + }
> + else
> + {
> + SVN_ERR(svn_stream_printf_from_utf8
> + (baton->output_stream, baton->header_encoding,
> + baton->pool,
> + " +%" APR_OFF_T_FMT ",%" APR_OFF_T_FMT " %s" APR_EOL_STR,
> + baton->hunk_start[1], baton->hunk_length[1],
> + hunk_delimiter));
>
> + }
> +
> hunk_len = baton->hunk->len;
> SVN_ERR(svn_stream_write(baton->output_stream,
> baton->hunk->data, &hunk_len));
> @@ -480,6 +495,16 @@
> return SVN_NO_ERROR;
> }
>
> +/* Flush the hunk currently built up in BATON
> + into the baton's output_stream */
> +static svn_error_t *
> +output_unified_flush_hunk(output_baton_t *baton)
> +{

I.e. this isn't needed.

> + SVN_ERR(output_unified_flush_hunk2(baton, "@@"));
> +
> + return SVN_NO_ERROR;
> +}
> +
> /* Implements svn_diff_output_fns_t::output_diff_modified */
> static svn_error_t *
> output_unified_diff_modified(void *baton,
> @@ -530,8 +555,9 @@
>
>
> svn_error_t *
> -svn_diff_mem_string_output_unified(svn_stream_t *output_stream,
> +svn_diff_mem_string_output_unified2(svn_stream_t *output_stream,

This _is_ needed, however :)

You can recognize public API also by its svn_ prefix (with a single
underscore).

> svn_diff_t *diff,
> + svn_boolean_t property_diff,

svn_diff_mem_string_output_unified2() can be used on any content,
so I think the name 'property_diff' is too specific.

Can we call it 'with_diff_header' or something like that?

> const char *original_header,
> const char *modified_header,
> const char *header_encoding,
> @@ -563,15 +589,20 @@
> fill_source_tokens(&baton.sources[0], original, pool);
> fill_source_tokens(&baton.sources[1], modified, pool);
>
> - SVN_ERR(svn_stream_printf_from_utf8
> - (output_stream, header_encoding, pool,
> - "--- %s" APR_EOL_STR
> - "+++ %s" APR_EOL_STR,
> - original_header, modified_header));
> + if (!property_diff) {
> + SVN_ERR(svn_stream_printf_from_utf8
> + (output_stream, header_encoding, pool,
> + "--- %s" APR_EOL_STR
> + "+++ %s" APR_EOL_STR,
> + original_header, modified_header));
> + }
>
> SVN_ERR(svn_diff_output(diff, &baton,
> &mem_output_unified_vtable));
> - SVN_ERR(output_unified_flush_hunk(&baton));
> + if (property_diff)
> + SVN_ERR(output_unified_flush_hunk2(&baton, "##"));

Can we pass the hunk header delimiters in as a parameter, too?
Just like you did in output_unified_flush_hunk2()?
Let the caller decide what it should be.

> + else
> + SVN_ERR(output_unified_flush_hunk(&baton));
>
> svn_pool_destroy(baton.pool);
> }
> @@ -579,7 +610,30 @@
> return SVN_NO_ERROR;
> }
>
> +svn_error_t *
> +svn_diff_mem_string_output_unified(svn_stream_t *output_stream,
> + svn_diff_t *diff,
> + const char *original_header,
> + const char *modified_header,
> + const char *header_encoding,
> + const svn_string_t *original,
> + const svn_string_t *modified,
> + apr_pool_t *pool)
> +{
> + SVN_ERR(svn_diff_mem_string_output_unified2(
> + output_stream,
> + diff,

> + FALSE,

If you agree with changing the name of the parameter (and hence its
meaning) this FALSE might need to be toggled.

> + original_header,
> + modified_header,
> + header_encoding,
> + original,
> + modified,
> + pool));
> + return SVN_NO_ERROR;
> +}
>
> +
>
> /* diff3 merge output */
>
> Index: subversion/include/svn_diff.h
> ===================================================================
> --- subversion/include/svn_diff.h (revision 38477)
> +++ subversion/include/svn_diff.h (arbetskopia)
> @@ -667,6 +667,29 @@
> const svn_diff_file_options_t *options,
> apr_pool_t *pool);
>
> +/** Outputs the @a diff object generated by svn_diff_mem_string_diff()
> + * in unified diff format on @a output_stream, using @a original

Instead of adding an entirely new declaration of the rev'd
function, just edit the existing docstring and declaration
and provide a new declaration of the old version, which should
only describe in its docstring how it differs from its successor.

> + * and @a modified for the text in the output.
> + * Only if @a display_header is TRUE it outputs the header. Useful for
> + * property diffs.

Hey here you're already using a better name :)

> + * Outputs the header and markers in @a header_encoding.
> + *

> + *
One empty line is enough spacing, two aren't needed (but meh).

> + * @a original_header and @a modified header are
> + * used to fill the field after the "---" and "+++" header markers.
> + *
> + * @since New in 1.7.
> + */
> +svn_error_t *
> +svn_diff_mem_string_output_unified2(svn_stream_t *output_stream,
> + svn_diff_t *diff,
> + svn_boolean_t display_headers,
> + const char *original_header,
> + const char *modified_header,
> + const char *header_encoding,
> + const svn_string_t *original,
> + const svn_string_t *modified,
> + apr_pool_t *pool);
>
> /** Outputs the @a diff object generated by svn_diff_mem_string_diff()

I.e. here, you'd say "Like svn_diff_mem_string_diff2(), except ..."

> * in unified diff format on @a output_stream, using @a original
> Index: subversion/libsvn_client/diff.c
> ===================================================================
> --- subversion/libsvn_client/diff.c (revision 38477)
> +++ subversion/libsvn_client/diff.c (arbetskopia)
> @@ -167,6 +167,32 @@
> _("Path '%s' must be an immediate child of " \
> "the directory '%s'"), path, relative_to_dir)
>
> +/* A helper func used by display_prop_diffs. It appends the systems default
> + eol character if there is none in the string. */

In this docstring, mention TOKEN and POOL and explain what they are
and what they are used for. POOL almost always gets the same
explanation, look around for examples.

> +static const svn_string_t *
> +append_eol(const svn_string_t *token, apr_pool_t *pool)
> +{
> + const char *curp;
> +
> + if (token->len == 0)
> + return token;
> +
> + curp = token->data + token->len - 1;

Please use svn_subst_detect_eol() if possible.
See include/svn_subst.h.

> + if (*curp == '\r')
> + {
> + return token;
> + }
> + else if (*curp != '\n')
> + {
> + const char *tmp = apr_psprintf(pool, "%s%s", token->data, APR_EOL_STR);
> + return svn_string_create(tmp, pool);
> + }
> + else
> + {
> + return token;
> + }
> +}
> +
> /* A helper func that writes out verbal descriptions of property diffs
> to FILE. Of course, the apr_file_t will probably be the 'outfile'
> passed to svn_client_diff5, which is probably stdout. */
> @@ -242,53 +268,38 @@
> continue;
> }
>
> - /* For now, we have a rather simple heuristic: if this is an
> - "svn:" property, then assume the value is UTF-8 and must
> - therefore be converted before printing. Otherwise, just
> - print whatever's there and hope for the best. */
> {
> - svn_boolean_t val_is_utf8 = svn_prop_is_svn_prop(propchange->name);
> + /* Use libsvn_diff for output.*/
> + svn_stream_t *os = svn_stream_from_aprfile2(file, TRUE, pool);
> + svn_diff_t *diff;
> + svn_diff_file_options_t options;
> + const svn_string_t *tmp;
>
> - if (original_value != NULL)
> - {
> - if (val_is_utf8)
> - {
> - SVN_ERR(file_printf_from_utf8
> - (file, encoding,
> - " - %s" APR_EOL_STR, original_value->data));
> - }
> - else
> - {
> - /* ### todo: check for error? */
> - apr_file_printf
> - (file, " - %s" APR_EOL_STR, original_value->data);
> - }
> - }

Glad to see the above block of code go, it's just silly.

> + /* When setting or editing a property no eol character is inserted.

What do you mean here? Do you mean "the property value won't ever
contain any newlines"? Or do you mean "the last character of the property
value won't be a newline"?

> + Since the diff is not useful anyway for patching properties an
> + eol character is appended to remove those pescious ' \ No newline
> + at end of file' lines. */
> + tmp = original_value ? original_value : svn_string_create("", pool);
>
> - if (propchange->value != NULL)
> - {
> - if (val_is_utf8)
> - {
> - SVN_ERR(file_printf_from_utf8
> - (file, encoding, " + %s" APR_EOL_STR,
> - propchange->value->data));
> - }
> - else
> - {
> - /* ### todo: check for error? */
> - apr_file_printf(file, " + %s" APR_EOL_STR,
> - propchange->value->data);
> - }
> - }
> + const svn_string_t *orig = append_eol(tmp, pool);
> +
> + tmp = propchange->value ? propchange->value :
> + svn_string_create("", pool);
> +
> + const svn_string_t *val = append_eol(tmp, pool);
> +
> + SVN_ERR(svn_diff_mem_string_diff(&diff, orig, val, &options, pool));
> +
> + /* The diff is printed with special property settings */

Here is the place where the caller could specify what the hunk headers
will look like.

> + SVN_ERR(svn_diff_mem_string_output_unified2(os, diff, TRUE,
> + svn_dirent_local_style(path, pool),
> + svn_dirent_local_style(path, pool),
> + encoding, orig, val, pool));
> + SVN_ERR(svn_stream_close(os));
> +
> }
> }

Glad to see your patch, looks great overall.

Thanks,
Stefan

>
> - /* ### todo [issue #1533]: Use file_printf_from_utf8() to convert this
> - to native encoding, at least conditionally? Or is it better to
> - have under_string always output the same eol, so programs can
> - find it consistently? Also, what about checking for error? */
> - apr_file_printf(file, APR_EOL_STR);
> -
> return SVN_NO_ERROR;
> }
>
> > ------------------------------------------------------
> > http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2375349
>
Received on 2009-07-24 21:14:04 CEST

This is an archived mail posted to the Subversion Dev mailing list.