Re: issue 3459 - svn patch does not tolerate empty lines

From: Stefan Sperling <stsp_at_elego.de>
Date: Fri, 7 Aug 2009 13:23:51 +0100

On Fri, Aug 07, 2009 at 01:40:10PM +0200, Daniel Näslund wrote:
> On Thu, Aug 06, 2009 at 01:10:43PM +0100, Stefan Sperling wrote:
> > The parser right now assumes that every line which is part
> > of a hunk starts with '+', '-', or ' '.
> >
> > This works for correctly formatted diffs as output by 'svn diff'
> > and 'diff', but not for mangled diffs.
> >
>
> My second attempt. I allow empty lines to be included in the hunk. And
> when reading the hunk in subversion/libsvn_client/patch.c (match_hunk) I
> insert a space if missing.
>
> It would be better to shield libsvn_client from the empty lines handling
> and insert an extra space when needed in the streams in
> subversion/libsvn_diff/parse-diff.c (svn_diff__parse_next_hunk. Should I
> do that instead? I do miss Java IO (thought I would never say that).
>
> Index: subversion/libsvn_diff/parse-diff.c
> ===================================================================
> --- subversion/libsvn_diff/parse-diff.c (revision 38611)
> +++ subversion/libsvn_diff/parse-diff.c (arbetskopia)
> @@ -358,8 +358,11 @@
> }
>
> c = line->data[0];
> - if (c == ' ' || c == '-' || c == '+')
> - hunk_seen = TRUE;
> +
> + /* It is allowed for a line to be empty. */
> + if (c == ' ' || c == '-' || c == '+'
> + || (! eof && line->len == 0))
> + hunk_seen = TRUE;
> else
> {
> in_hunk = FALSE;
> Index: subversion/libsvn_client/patch.c
> ===================================================================
> --- subversion/libsvn_client/patch.c (revision 38611)
> +++ subversion/libsvn_client/patch.c (arbetskopia)
> @@ -2108,6 +2108,13 @@
> target->patch->eol_str, &hunk_eof, iterpool));
> SVN_ERR(svn_stream_readline(target->stream, &target_line, target->eol_str,
> &target->eof, iterpool));
> +
> + /* If leading spaces has been dropped, the patch should still be
> + * applied. Started out as an issue with the mailing software
> + * tigris that for some reason ate extra spaces. */

Just say that UNIX patch accepts this, too. There's no need to bash
tigris in comments in our code. It's getting enough bashing elsewhere as is.

> + if (hunk_line->len == 0)
> + svn_stringbuf_appendcstr(hunk_line, " ");
> +
> if (! hunk_eof && hunk_line->len > 0)
> {
> char c = hunk_line->data[0];
>
>
> > Maybe libsvn_client should be shielded from the unidiff specifics
> > entirely, and just get original and modified hunk lines from libsvn_diff
> > with the extra '+', '-' etc. already removed.
>
> Involves changing svn_hunk_t to use some other classification than
> original_text and modified_text? Beeing lazy I would prefer not having
> to do that. :-)

It may involve changing that.
I was hoping you'd figure out what's needed :-P

I'd still prefer a solution where libsvn_client conceptually
ends up in a situation like:

  original_text = "foo"
  modified_text = "bar"
  latest_text = "foo"

which would be much more easy to deal with instead of the current:

  original_text = "-foo"
  modified_text = "+bar"
  latest_text = "foo"

Right now, the information about a pattern being original or modified
is encoded twice, both in the variable name and the first character
(except for latest_text which needs special-casing). libsvn_client
doesn't really care about the plusses and minusses. It does the text
string matching, not the parsing.

Say we'd want to add a parser for context diffs (diff -c) at some point.
That would be easier if libsvn_client was already shielded from the
specifics of the unidiff format.

> Now I am able to patch the simple example given in the issue description
> but when trying to create a file using this patch I get a conflict
> (There is one empty line at the end of the patch) :
>
> Index: foo2
> ===================================================================
> --- foo2 (revision 0)
> +++ foo2 (revision 0)
> @@ -0,0 +1 @@
> +test
>
> Is this an issue with the fuzz? If svn patch would allow at least one
> line of fuzz then this patch would work?
>
> Or do I need a way to find out when an empty line is there for
> readability (as the blank line before Property changes on ...) or when
> the leading space has been chopped?

There are two cases:

  1) trailing empty line is part of the hunk's context
  2) trailing empty line is part of the noise surrounding patches
     in the unidiff

I guess we can assume that the number of lines of leading context
in a hunk is equal to the number of lines of trailing context in
a hunk. That should provide a way to resolve the ambiguity.

Stefan

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2381272
Received on 2009-08-07 14:24:16 CEST

This message: [ Message body ]
Next message: Paul Burba: "Re: Merging the subtree-mergeinfo branch back to trunk"
Previous message: Hyrum K. Wright: "Re: [PATCH]remove adm_access in svn_wc_translated_file3"
In reply to: Daniel Näslund: "Re: issue 3459 - svn patch does not tolerate empty lines"
Next in thread: Daniel Näslund: "[PATCH] issue 3459 - svn patch does not tolerate empty lines"
Reply: Daniel Näslund: "[PATCH] issue 3459 - svn patch does not tolerate empty lines"
Reply: Daniel Näslund: "[PATCH] char_filter_callback for streams"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]