[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] issue 3459 - svn patch does not tolerate empty lines

From: Stefan Sperling <stsp_at_elego.de>
Date: Fri, 14 Aug 2009 07:38:41 +0100

On Fri, Aug 14, 2009 at 07:27:23AM +0200, Daniel Näslund wrote:
> Ping! This patch hasn't been reviewed.

Reviewed now, see below.

> > [[[
> > Fix issue 3459 - svn patch does not tolerate empty lines.
> >
> > * subversion/libsvn_diff/parse-diff.c
> > (set_hunk_text): Removes leading character before setting hunk text.
> > (svn_diff__parse_next_hunk): Allow empty lines in hunk.
> >
> > * subversion/tests/libsvn_diff/parse-diff-test.c
> > (test_parse_unidiff): Fix changed representation of hunks.
> >
> > * subversion/libsvn_client/patch.c
> > (match_hunk, copy_hunk_text): Fix changed representation of hunks.
> > ]]]
> >
> > Mvh
> > Daniel
> >
>
> > Index: subversion/libsvn_diff/parse-diff.c
> > ===================================================================
> > --- subversion/libsvn_diff/parse-diff.c (revision 38646)
> > +++ subversion/libsvn_diff/parse-diff.c (arbetskopia)
> > @@ -30,6 +30,7 @@
> > #include "svn_pools.h"
> > #include "svn_utf.h"
> > #include "svn_dirent_uri.h"
> > +#include "svn_cmdline.h"
> >
> > #include "private/svn_diff_private.h"
> >
> > @@ -273,6 +274,42 @@
> > return TRUE;
> > }
> >
> > +static svn_error_t *
> > +set_hunk_text(svn_stream_t **hunk_text, svn_stream_t **text,

Why isn't text just a pointer, ie. *text instead of **text?
You're not using it as an output parameter.

> > + apr_pool_t *pool)
> > +{
> > + int offset;
> > + svn_stringbuf_t *line, *buf;
> > + svn_boolean_t eof;
> > + char *c_line;
> > +
> > + buf = svn_stringbuf_create("", pool);
> > +
> > + do
> > + {
> > + SVN_ERR(svn_stream_readline(*text, &line, APR_EOL_STR, &eof, pool));

What if the patch and target file use an eol-style other than
APR_EOL_STR? We need to support that.

> > +
> > + char c = line->data[0];
> > +
> > + if ( c == ' ' || c == '-' || c == '+')
> > + offset = 1;
> > + else
> > + offset = 0;
> > +
> > + if (! eof)
> > + c_line = apr_psprintf(pool, "%s%s", line->data + offset, APR_EOL_STR);

Same here.

> > + else
> > + c_line = apr_psprintf(pool, "%s", line->data + offset);
> > +
> > + svn_stringbuf_appendcstr(buf, c_line);

I went to great length to avoid sucking an entire hunk into memory,
but this change would make us suck an entire hunk into memory again.

A hunk may be very large, e.g. when adding a large file.
The stream 'text' that comes into this function is backed by a range
of the patch file instead of having its content allocated in a pool.
This way, 'svn patch' will only allocate memory for one line at a time
instead of one hunk at a time. (Someone could still pass us a 4GB long
line and crash 'svn patch', we might want to add a sanity check for
that condition before release -- but that's for another patch :)

Instead of putting the hunk into memory, you could try wrapping the
text stream in another stream that shaves leading characters from
every line read (in svn_stream_readline()). That stream could then
be passed to libsvn_client. Such a stream does not yet exist so you
need to create it. See include/svn_io.h for more information about
streams, and see the existing examples in libsvn_subr/stream.c.

> > +
> > + } while (! eof);
> > +
> > + *hunk_text = svn_stream_from_stringbuf(buf, pool);
> > +
> > + return SVN_NO_ERROR;
> > +}
> > +
> > /* A stream line-filter which allows only original text from a hunk. */
> > static svn_error_t *
> > original_line_filter(svn_boolean_t *filtered, const char *line,
> > @@ -305,6 +342,7 @@
> > svn_stream_t *diff_text;
> > svn_stream_t *original_text;
> > svn_stream_t *modified_text;
> > + svn_linenum_t original_lines;
> > svn_stream_t *s;
> > apr_pool_t *iterpool;
> >
> > @@ -358,8 +396,22 @@
> > }
> >
> > c = line->data[0];
> > - if (c == ' ' || c == '-' || c == '+')
> > - hunk_seen = TRUE;
> > +
> > + if (c == ' ' || c == '-')
> > + {
> > + hunk_seen = TRUE;
> > + original_lines--;

What happens if this wraps around? svn_linenum_t is unsigned.

> > + }
> > + else if (c == '+')
> > + {
> > + hunk_seen = TRUE;
> > + }
> > + /* Should tolerate chopped leading spaces on empty lines. */
> > + else if (original_lines > 0 && ! eof && line->len == 0)
> > + {
> > + hunk_seen = TRUE;
> > + original_lines--;

This can't wrap around, good.

> > + }
> > else
> > {
> > in_hunk = FALSE;
> > @@ -376,6 +428,10 @@
> > if (starts_with(line->data, atat))
> > /* Looks like we have a hunk header, let's try to rip it apart. */
> > in_hunk = parse_hunk_header(line->data, *hunk, iterpool);
> > +
> > + if (in_hunk)
> > + original_lines = (*hunk)->original_length;
> > +
> > else if (starts_with(line->data, minus))
> > /* This could be a header of another patch. Bail out. */
> > break;
> > @@ -421,9 +477,13 @@
> > svn_stream_set_line_filter_callback(modified_text, modified_line_filter);
> >
> > /* Set the hunk's texts. */
> > - (*hunk)->diff_text = diff_text;
> > - (*hunk)->original_text = original_text;
> > - (*hunk)->modified_text = modified_text;
> > + SVN_ERR(set_hunk_text(&(*hunk)->diff_text, &diff_text,
> > + result_pool));
> > + SVN_ERR(set_hunk_text(&(*hunk)->original_text, &original_text,
> > + result_pool));
> > + SVN_ERR(set_hunk_text(&(*hunk)->modified_text, &modified_text,
> > + result_pool));
> > +
> > }
> > else
> > /* Something went wrong, just discard the result. */

The test adjustments below look good, that's exactly how we want it
to behave!

Thanks,
Stefan

> > Index: subversion/tests/libsvn_diff/parse-diff-test.c
> > ===================================================================
> > --- subversion/tests/libsvn_diff/parse-diff-test.c (revision 38646)
> > +++ subversion/tests/libsvn_diff/parse-diff-test.c (arbetskopia)
> > @@ -46,7 +46,7 @@
> > "===================================================================" NL
> > "--- A/D/gamma.orig" NL
> > "+++ A/D/gamma" NL
> > - "@@ -1 +1,2 @@" NL
> > + "@@ -1,2 +1 @@" NL
> > " This is the file 'gamma'." NL
> > "-some less bytes to 'gamma'" NL
> > "" NL
> > @@ -99,7 +99,7 @@
> > /* Make sure original text was parsed correctly. */
> > SVN_ERR(svn_stream_readline(hunk->original_text, &buf, NL, &eof, pool));
> > SVN_ERR_ASSERT(! eof);
> > - SVN_ERR_ASSERT(! strcmp(buf->data, " This is the file 'gamma'."));
> > + SVN_ERR_ASSERT(! strcmp(buf->data, "This is the file 'gamma'."));
> > /* Now we should get EOF. */
> > SVN_ERR(svn_stream_readline(hunk->original_text, &buf, NL, &eof, pool));
> > SVN_ERR_ASSERT(eof);
> > @@ -108,10 +108,10 @@
> > /* Make sure modified text was parsed correctly. */
> > SVN_ERR(svn_stream_readline(hunk->modified_text, &buf, NL, &eof, pool));
> > SVN_ERR_ASSERT(! eof);
> > - SVN_ERR_ASSERT(! strcmp(buf->data, " This is the file 'gamma'."));
> > + SVN_ERR_ASSERT(! strcmp(buf->data, "This is the file 'gamma'."));
> > SVN_ERR(svn_stream_readline(hunk->modified_text, &buf, NL, &eof, pool));
> > SVN_ERR_ASSERT(! eof);
> > - SVN_ERR_ASSERT(! strcmp(buf->data, "+some more bytes to 'gamma'"));
> > + SVN_ERR_ASSERT(! strcmp(buf->data, "some more bytes to 'gamma'"));
> > /* Now we should get EOF. */
> > SVN_ERR(svn_stream_readline(hunk->modified_text, &buf, NL, &eof, pool));
> > SVN_ERR_ASSERT(eof);
> > @@ -128,10 +128,10 @@
> > /* Make sure original text was parsed correctly. */
> > SVN_ERR(svn_stream_readline(hunk->original_text, &buf, NL, &eof, pool));
> > SVN_ERR_ASSERT(! eof);
> > - SVN_ERR_ASSERT(! strcmp(buf->data, " This is the file 'gamma'."));
> > + SVN_ERR_ASSERT(! strcmp(buf->data, "This is the file 'gamma'."));
> > SVN_ERR(svn_stream_readline(hunk->original_text, &buf, NL, &eof, pool));
> > SVN_ERR_ASSERT(! eof);
> > - SVN_ERR_ASSERT(! strcmp(buf->data, "-some less bytes to 'gamma'"));
> > + SVN_ERR_ASSERT(! strcmp(buf->data, "some less bytes to 'gamma'"));
> > /* Now we should get EOF. */
> > SVN_ERR(svn_stream_readline(hunk->original_text, &buf, NL, &eof, pool));
> > SVN_ERR_ASSERT(eof);
> > @@ -140,7 +140,7 @@
> > /* Make sure modified text was parsed correctly. */
> > SVN_ERR(svn_stream_readline(hunk->modified_text, &buf, NL, &eof, pool));
> > SVN_ERR_ASSERT(! eof);
> > - SVN_ERR_ASSERT(! strcmp(buf->data, " This is the file 'gamma'."));
> > + SVN_ERR_ASSERT(! strcmp(buf->data, "This is the file 'gamma'."));
> > /* Now we should get EOF. */
> > SVN_ERR(svn_stream_readline(hunk->modified_text, &buf, NL, &eof, pool));
> > SVN_ERR_ASSERT(eof);
> > Index: subversion/libsvn_client/patch.c
> > ===================================================================
> > --- subversion/libsvn_client/patch.c (revision 38646)
> > +++ subversion/libsvn_client/patch.c (arbetskopia)
> > @@ -2108,12 +2108,11 @@
> > target->patch->eol_str, &hunk_eof, iterpool));
> > SVN_ERR(svn_stream_readline(target->stream, &target_line, target->eol_str,
> > &target->eof, iterpool));
> > +
> > if (! hunk_eof && hunk_line->len > 0)
> > {
> > - char c = hunk_line->data[0];
> > - SVN_ERR_ASSERT(c == ' ' || c == '-');
> > - lines_matched = (hunk_line->len == target_line->len + 1 &&
> > - ! strcmp(hunk_line->data + 1, target_line->data));
> > + lines_matched = (hunk_line->len == target_line->len &&
> > + ! strcmp(hunk_line->data, target_line->data));
> > }
> > }
> > while (lines_matched && ! (hunk_eof || target->eof));
> > @@ -2386,13 +2385,9 @@
> > {
> > if (line->len >= 1)
> > {
> > - char c = line->data[0];
> > -
> > - SVN_ERR_ASSERT(c == ' ' || c == '+' || c == '-');
> > - len = line->len - 1;
> > - SVN_ERR(svn_io_file_write_full(file, line->data + 1, len, &len,
> > + len = line->len;
> > + SVN_ERR(svn_io_file_write_full(file, line->data, len, &len,
> > iterpool));
> > - SVN_ERR_ASSERT(len == line->len - 1);
> > }
> >
> > /* Add newline. */
>
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383529

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2383544
Received on 2009-08-14 08:39:16 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.