[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] v4 #3460

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Thu, 28 Jan 2010 12:53:08 +0000

Hi Daniel.

Thanks for fixing the doc strings.

Re. the spam email used as test data - nice idea, but it kind of offends
my eyes to see such junk on my screen :-)

> +/* Indicate in *MATCHED whether the original text of HUNK matches the patch
> + * TARGET at its current line. Lines at FUZZ offset from start or end of the

"Lines within FUZZ lines of the start or end ..."

> + * hunk will always match. When this function returns, neither
> + * TARGET->CURRENT_LINE nor the file offset in the target file will have
> + * changed. HUNK->ORIGINAL_TEXT will be reset. Do temporary allocations in
> + * POOL. */
> static svn_error_t *
> match_hunk(svn_boolean_t *matched, patch_target_t *target,
> - const svn_hunk_t *hunk, apr_pool_t *pool)
> + const svn_hunk_t *hunk, int fuzz, apr_pool_t *pool)

> /* Scan lines of TARGET for a match of the original text of HUNK,
> * up to but not including the specified UPPER_LINE.
> + * Use fuzz factor FUZZ everywhere if possible.

I would drop the "if possible". It implies that this function decides
not to use the fuzz factor in some cases.

> * If UPPER_LINE is zero scan until EOF occurs when reading from TARGET.
> * Return the line at which HUNK was matched in *MATCHED_LINE.
> * If the hunk did not match at all, set *MATCHED_LINE to zero.
> @@ -670,7 +688,7 @@
> static svn_error_t *
> scan_for_match(svn_linenum_t *matched_line, patch_target_t *target,
> const svn_hunk_t *hunk, svn_boolean_t match_first,
> - svn_linenum_t upper_line, apr_pool_t *pool)
> + svn_linenum_t upper_line, int fuzz, apr_pool_t *pool)

> /* Copy HUNK_TEXT into TARGET, line by line, such that the line filter
> * and transformation callbacks set on HUNK_TEXT by the diff parsing
> * code in libsvn_diff will trigger. ABSPATH is the absolute path to the
> - * file underlying TARGET. */
> + * file underlying TARGET. Do not copy the lines that is within FUZZ offset
> + * from the beginning or end of hunk unless NR_OF_LINES is set to 0. If
> + * NR_OF_LINES is non-zero, it represents the number of lines in HUNK_TEXT. */
> static svn_error_t *
> copy_hunk_text(svn_stream_t *hunk_text, svn_stream_t *target,
> - const char *abspath, apr_pool_t *scratch_pool)
> + const char *abspath, int fuzz, int nr_of_lines,
> + apr_pool_t *scratch_pool)

This looks odd. Before now, the function worked without needing to be
told the number of lines in the hunk, and now it needs to be told, but
that means it now has two ways of knowing. Looking at the
implementation, I can see why you want this, but from an interface point
of view it would be better if it could still work without being told. (I
can envisage a way of doing that - by keeping the last FUZZ lines that
have been read in a circular buffer before writing them out, so that at
the end of the input you can discard the FUZZ lines that are in the
buffer.)

But let's say you want to keep the simple implementation that needs to
know NR_OF_LINES in advance. Why have two ways to request non-fuzzy
operation: the obvious way (fuzz=0) or the special way (nr_of_lines=0)?
I don't think you need that special exception "unless NR_OF_LINES is 0".
Can you keep the rules simple and delete that special case?

> > Here it looks like there should be a parameter called FUZZ but there
> > isn't.
>
> hunk_info_t contains a field called fuzz. [...]

OK - it was just because FUZZ in capital letters looked like it was
meant to refer to a function parameter. You could write "HI->fuzz"
instead.

> Index: subversion/libsvn_diff/parse-diff.c
> ===================================================================
> --- subversion/libsvn_diff/parse-diff.c (revision 903978)
> +++ subversion/libsvn_diff/parse-diff.c (arbetskopia)
> @@ -278,20 +281,45 @@
> }
>
> c = line->data[0];
> - if (original_lines > 0 && (c == ' ' || c == '-'))
> + if (original_lines > 0 && c == ' ' )
> {
> hunk_seen = TRUE;
> original_lines--;
> + if (changed_line_seen)
> + trailing_context++;
> + else
> + leading_context++;
> }
> + else if (original_lines > 0 && c == '-')
> + {
> + hunk_seen = TRUE;
> + original_lines--;
> + changed_line_seen = TRUE;
> +
> + /* A hunk may have context in the middle. We only want the
> + last lines of context. */
> + if (trailing_context > 0)
> + trailing_context = 0;
> + }
> else if (c == '+')
> {
> hunk_seen = TRUE;
> + changed_line_seen = TRUE;
> +
> + /* A hunk may have context in the middle. We only want the
> + last lines of context. */
> + if (trailing_context > 0)
> + trailing_context = 0;
> }
> /* Tolerate chopped leading spaces on empty lines. */
> else if (original_lines > 0 && ! eof && line->len == 0)
> {
> hunk_seen = TRUE;
> original_lines--;
> + if (changed_line_seen)
> + trailing_context++;
> + else
> + leading_context++;
> }

This fourth block would neatly combine with the first block:

if (original_lines > 0 && (c == ' ' || (! eof && line->len == 0))
  { ... }

- Julian
Received on 2010-01-28 13:53:58 CET

This is an archived mail posted to the Subversion Dev mailing list.