On Fri, Aug 07, 2009 at 01:40:10PM +0200, Daniel Näslund wrote:
> On Thu, Aug 06, 2009 at 01:10:43PM +0100, Stefan Sperling wrote:
> > The parser right now assumes that every line which is part
> > of a hunk starts with '+', '-', or ' '.
> > This works for correctly formatted diffs as output by 'svn diff'
> > and 'diff', but not for mangled diffs.
> My second attempt. I allow empty lines to be included in the hunk. And
> when reading the hunk in subversion/libsvn_client/patch.c (match_hunk) I
> insert a space if missing.
> It would be better to shield libsvn_client from the empty lines handling
> and insert an extra space when needed in the streams in
> subversion/libsvn_diff/parse-diff.c (svn_diff__parse_next_hunk. Should I
> do that instead? I do miss Java IO (thought I would never say that).
> Index: subversion/libsvn_diff/parse-diff.c
> --- subversion/libsvn_diff/parse-diff.c (revision 38611)
> +++ subversion/libsvn_diff/parse-diff.c (arbetskopia)
> @@ -358,8 +358,11 @@
> c = line->data;
> - if (c == ' ' || c == '-' || c == '+')
> - hunk_seen = TRUE;
> + /* It is allowed for a line to be empty. */
> + if (c == ' ' || c == '-' || c == '+'
> + || (! eof && line->len == 0))
> + hunk_seen = TRUE;
> in_hunk = FALSE;
> Index: subversion/libsvn_client/patch.c
> --- subversion/libsvn_client/patch.c (revision 38611)
> +++ subversion/libsvn_client/patch.c (arbetskopia)
> @@ -2108,6 +2108,13 @@
> target->patch->eol_str, &hunk_eof, iterpool));
> SVN_ERR(svn_stream_readline(target->stream, &target_line, target->eol_str,
> &target->eof, iterpool));
> + /* If leading spaces has been dropped, the patch should still be
> + * applied. Started out as an issue with the mailing software
> + * tigris that for some reason ate extra spaces. */
Just say that UNIX patch accepts this, too. There's no need to bash
tigris in comments in our code. It's getting enough bashing elsewhere as is.
> + if (hunk_line->len == 0)
> + svn_stringbuf_appendcstr(hunk_line, " ");
> if (! hunk_eof && hunk_line->len > 0)
> char c = hunk_line->data;
> > Maybe libsvn_client should be shielded from the unidiff specifics
> > entirely, and just get original and modified hunk lines from libsvn_diff
> > with the extra '+', '-' etc. already removed.
> Involves changing svn_hunk_t to use some other classification than
> original_text and modified_text? Beeing lazy I would prefer not having
> to do that. :-)
It may involve changing that.
I was hoping you'd figure out what's needed :-P
I'd still prefer a solution where libsvn_client conceptually
ends up in a situation like:
original_text = "foo"
modified_text = "bar"
latest_text = "foo"
which would be much more easy to deal with instead of the current:
original_text = "-foo"
modified_text = "+bar"
latest_text = "foo"
Right now, the information about a pattern being original or modified
is encoded twice, both in the variable name and the first character
(except for latest_text which needs special-casing). libsvn_client
doesn't really care about the plusses and minusses. It does the text
string matching, not the parsing.
Say we'd want to add a parser for context diffs (diff -c) at some point.
That would be easier if libsvn_client was already shielded from the
specifics of the unidiff format.
> Now I am able to patch the simple example given in the issue description
> but when trying to create a file using this patch I get a conflict
> (There is one empty line at the end of the patch) :
> Index: foo2
> --- foo2 (revision 0)
> +++ foo2 (revision 0)
> @@ -0,0 +1 @@
> Is this an issue with the fuzz? If svn patch would allow at least one
> line of fuzz then this patch would work?
> Or do I need a way to find out when an empty line is there for
> readability (as the blank line before Property changes on ...) or when
> the leading space has been chopped?
There are two cases:
1) trailing empty line is part of the hunk's context
2) trailing empty line is part of the noise surrounding patches
in the unidiff
I guess we can assume that the number of lines of leading context
in a hunk is equal to the number of lines of trailing context in
a hunk. That should provide a way to resolve the ambiguity.
Received on 2009-08-07 14:24:16 CEST