On Thu, Oct 7, 2010 at 8:09 AM, Julian Foad <julian.foad_at_wandisco.com> wrote:
> Hi Paul.
>
> It's good to see a fix. I think, however, you could do this more simply
> and avoid some of the difficulties completely.
Julian,
Thanks as always for the review, comments inline.
> On Thu, 2010-09-30, pburba_at_apache.org wrote:
>> Author: pburba
>> Date: Thu Sep 30 20:21:19 2010
>> New Revision: 1003238
>>
>> URL: http://svn.apache.org/viewvc?rev=1003238&view=rev
>> Log:
>> Fix issue #3721 'redirection of svn propget output corrupted with large
>> property values'.
>
> And this change also fixes an EOL-style inconsistency, which was
> presumably happening consistently with property values of any size,
> doesn't it?
Yes.
> I think this change also introduces a new, similar, EOL-style
> inconsistency, in a different place; see last review comment at end of
> email.
You are right, there is a problem, though it might not be exactly what
you think:
If the property requires translation per svn_prop_needs_translation
(i.e. it's an svn:* prop) we will already have run the property value
through svn_subst_detranslate_string(), which gives us native line
endings. So in these cases the only non-native newline will be the
trailing newline after the propval:
/* Add an extra newline to the value before indenting, so that
* every line of output has the indentation whether the value
* already ended in a newline or not. */
const char *newval = apr_psprintf(pool, "%s\n", propval->data);
^^
If we return to the pre-r1003238 behavior of using printf, then
multiline svn:* prop values (e.g. svn:mergeinfo) are passed to printf
with native newlines and then printf tries (on my Windows box at any
rate) to convert those eols to native, resulting in redirected output
like this:
Properties on 'iota':<CR><LF>
propname<CR><LF>
subversion/branches/1.5.x:872364-874936<CR><CR><LF>
subversion/branches/1.5.x-34184:874657-874741<CR><CR><LF>
subversion/branches/1.5.x-34432:874744-874798<CR><CR><LF>
.
.
So avoiding printf's eol conversion by writing the the out stream
works better here.
> The difficulty with EOLs is that the native printf() family of functions
> typically converts to native EOLs, and the svn_stream_write() family
> doesn't, even when writing to svn_stream_for_stdout(). Therefore we
> have to be careful when mixing the two families.
>
>> This change makes all the writes to stdout, by svn pg, via the svn_stream_t *
>> that we already use to print the 'Properties on' header. Note that this
>> stream *is* attached to stdout, but avoiding the mix of stream writes and
>> printfs solves issue #3721 on Windows.
>
> It looks like the root of the corruption is also the mixing of native
> 'printf' functions with writes to a 'svn_stream_for_stdout()' stream.
> Do you agree that that's the case, and that therefore I should write a
> warning about this in the documentation of 'svn_stream_for_stdout()'?
I agree that is what the symptoms point to. I couldn't definitely
pinpoint the exact cause however.
> Perhaps not now, but we should consider whether we can eliminate the
> corruption by doing something to svn_stream_for_stdout() and/or the
> native 'stdout' stream. It might be possible by turning off some
> buffering that is happening, or something like that, or probably it
> could be done by implementing the stream_for_stdout as a stream class
> that forwards to the printf family of functions.
>
> What we probably should do now is avoid mixing printf with svn streams
> in this 'svn propget' code. From what I can see, there is little or no
> reason why svn_cl__propget() was using an svn stream for some of its
> printing at all. It looks like it would be much simpler to remove that
> stream (the variable called 'out') and just use printf-style functions
> throughout, as is done for 'proplist' and most of the rest of the UI
> (except for 'cat' and 'blame'). Note that we have svn_cmdline_printf()
> which is a handy encoding-converting variant of the native 'printf'
> family.
>
> Does that sound good?
Assuming there was never a reason to mix printfs with stream writes in
the first place, then yes, this would be a much simpler fix. As
mentioned above though, we will need to deal with printf's conversion
to native eols when the string it's printing already has them.
Paul
> If so, then all the comments below become moot.
> (I wrote some of them before I figured all this out, and they may still
> be relevant if I'm wrong about some of this.)
>
>
>> * subversion/svn/cl.h
>>
>> (svn_cl__print_prop_hash): Support printing to a passed in svn_stream_t *.
>>
>> * subversion/svn/propget-cmd.c
>>
>> (print_properties): Make the header's line endings native. Print the
>> property names and values to the same svn_stream_t * we already use
>> print the headers to.
>>
>> * subversion/svn/proplist-cmd.c
>>
>> (proplist_receiver, svn_cl__proplist): Update calls to
>> svn_cl__print_prop_hash(), keeping old behavior.
>>
>> * subversion/svn/props.c
>>
>> (svn_cl__print_prop_hash): Support printing to a passed in svn_stream_t *,
>> otherwise fall-back to old behavior of using printf().
>>
>>
>> * subversion/tests/cmdline/prop_tests.py
>>
>> (propget_redirection): Remove comment re failure status.
>>
>> (test_list): Remove XFail.
>
>
>> Modified: subversion/trunk/subversion/svn/cl.h
>> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/cl.h?rev=1003238&r1=1003237&r2=1003238&view=diff
>> ==============================================================================
>> --- subversion/trunk/subversion/svn/cl.h (original)
>> +++ subversion/trunk/subversion/svn/cl.h Thu Sep 30 20:21:19 2010
>> @@ -417,15 +417,18 @@ svn_cl__print_status_xml(const char *pat
>> apr_pool_t *pool);
>>
>>
>> -/* Print a hash that maps property names (char *) to property values
>> - (svn_string_t *). The names are assumed to be in UTF-8 format;
>> +/* Print to stdout a hash that maps property names (char *) to property
>> + values (svn_string_t *). The names are assumed to be in UTF-8 format;
>> the values are either in UTF-8 (the special Subversion props) or
>> plain binary values.
>>
>> + If OUT is not NULL, then write to it rather than stdout.
>
> This could do with pointing out that the two modes should not be mixed.
>
> This seems like unnecessary complexity, allowing the caller to choose
> whether to write to the native stdout stream or write to a supplied
> stream which in practice (at present) is always connected to stdout.
> Just support one or the other and make both callers consistent.
>
>> +
>> If NAMES_ONLY is true, print just names, else print names and
>> values. */
>> svn_error_t *
>> -svn_cl__print_prop_hash(apr_hash_t *prop_hash,
>> +svn_cl__print_prop_hash(svn_stream_t *out,
>> + apr_hash_t *prop_hash,
>> svn_boolean_t names_only,
>> apr_pool_t *pool);
>>
>>
>> Modified: subversion/trunk/subversion/svn/propget-cmd.c
>> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/propget-cmd.c?rev=1003238&r1=1003237&r2=1003238&view=diff
>> ==============================================================================
>> --- subversion/trunk/subversion/svn/propget-cmd.c (original)
>> +++ subversion/trunk/subversion/svn/propget-cmd.c Thu Sep 30 20:21:19 2010
>> @@ -140,6 +140,12 @@ print_properties(svn_stream_t *out,
>> ? _("Properties on '%s':\n")
>> : "%s - ", filename);
>> SVN_ERR(svn_cmdline_cstring_from_utf8(&header, header, iterpool));
>> + SVN_ERR(svn_subst_translate_cstring2(header, &header,
>> + APR_EOL_STR, /* 'native' eol */
>> + FALSE, /* no repair */
>> + NULL, /* no keywords */
>> + FALSE, /* no expansion */
>> + iterpool));
>
> Strictly speaking, the newline conversion should come before the
> translation from UTF8 to native encoding, in case a newline sequence is
> encoded differently in the native encoding. (I guess we probably don't
> support such encodings. If we do, there are other places where we do it
> wrongly.)
>
> That's a lot of source code just to print a string in native style.
> Four statements in total. Can't we write and use a dedicated function
> or a dedicated stream class that does this?
>
> Or write
>
> ? _("Properties on '%s':" APR_EOL_STR)
>
> and omit the translate_cstring.
>
>
>> SVN_ERR(stream_write(out, header, strlen(header)));
>> }
>>
>> @@ -149,7 +155,7 @@ print_properties(svn_stream_t *out,
>> apr_hash_t *hash = apr_hash_make(iterpool);
>>
>> apr_hash_set(hash, pname_utf8, APR_HASH_KEY_STRING, propval);
>> - svn_cl__print_prop_hash(hash, FALSE, iterpool);
>> + svn_cl__print_prop_hash(out, hash, FALSE, iterpool);
>> }
>> else
>> {
>>
>> Modified: subversion/trunk/subversion/svn/proplist-cmd.c
>> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/proplist-cmd.c?rev=1003238&r1=1003237&r2=1003238&view=diff
>> ==============================================================================
>> --- subversion/trunk/subversion/svn/proplist-cmd.c (original)
>> +++ subversion/trunk/subversion/svn/proplist-cmd.c Thu Sep 30 20:21:19 2010
>> @@ -98,7 +98,8 @@ proplist_receiver(void *baton,
>>
>> if (!opt_state->quiet)
>> SVN_ERR(svn_cmdline_printf(pool, _("Properties on '%s':\n"), name_local));
>> - return svn_cl__print_prop_hash(prop_hash, (! opt_state->verbose), pool);
>> + return svn_cl__print_prop_hash(NULL, prop_hash, (! opt_state->verbose),
>> + pool);
>> }
>>
>>
>> @@ -159,7 +160,7 @@ svn_cl__proplist(apr_getopt_t *os,
>> rev));
>>
>> SVN_ERR(svn_cl__print_prop_hash
>> - (proplist, (! opt_state->verbose), scratch_pool));
>> + (NULL, proplist, (! opt_state->verbose), scratch_pool));
>> }
>> }
>> else /* operate on normal, versioned properties (not revprops) */
>>
>> Modified: subversion/trunk/subversion/svn/props.c
>> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/props.c?rev=1003238&r1=1003237&r2=1003238&view=diff
>> ==============================================================================
>> --- subversion/trunk/subversion/svn/props.c (original)
>> +++ subversion/trunk/subversion/svn/props.c Thu Sep 30 20:21:19 2010
>> @@ -82,7 +82,8 @@ svn_cl__revprop_prepare(const svn_opt_re
>>
>>
>> svn_error_t *
>> -svn_cl__print_prop_hash(apr_hash_t *prop_hash,
>> +svn_cl__print_prop_hash(svn_stream_t *out,
>> + apr_hash_t *prop_hash,
>> svn_boolean_t names_only,
>> apr_pool_t *pool)
>> {
>> @@ -93,6 +94,7 @@ svn_cl__print_prop_hash(apr_hash_t *prop
>> const char *pname = svn__apr_hash_index_key(hi);
>> svn_string_t *propval = svn__apr_hash_index_val(hi);
>> const char *pname_stdout;
>> + apr_size_t len;
>>
>> if (svn_prop_needs_translation(pname))
>> SVN_ERR(svn_subst_detranslate_string(&propval, propval,
>> @@ -100,18 +102,45 @@ svn_cl__print_prop_hash(apr_hash_t *prop
>>
>> SVN_ERR(svn_cmdline_cstring_from_utf8(&pname_stdout, pname, pool));
>>
>> - /* ### We leave these printfs for now, since if propval wasn't translated
>> - * above, we don't know anything about its encoding. In fact, it
>> - * might be binary data... */
>> - printf(" %s\n", pname_stdout);
>
> So this 'printf' used to translate to native EOLs...
>
>> + if (out)
>> + {
>> + pname_stdout = apr_psprintf(pool, " %s\n", pname_stdout);
>> + SVN_ERR(svn_subst_translate_cstring2(pname_stdout, &pname_stdout,
>> + APR_EOL_STR, /* 'native' eol */
>> + FALSE, /* no repair */
>> + NULL, /* no keywords */
>> + FALSE, /* no expansion */
>> + pool));
>> +
>> + len = strlen(pname_stdout);
>> + SVN_ERR(svn_stream_write(out, pname_stdout, &len));
>
> ... whereas svn_stream_write() doesn't, so you had to insert a manual
> translation in order to preserve the functionality while switching to
> stream functions. (Unlike in propget-cmd.c above, here we don't convert
> from UTF8 because we don't know whether it is UTF8.)
>
> Same comments as above: maybe use SVN_EOL_STR or a dedicated printing
> function, instead of the translation function. (Neither a file name nor
> a prop name is allowed to contain a newline.)
>
>> + }
>> + else
>> + {
>> + /* ### We leave these printfs for now, since if propval wasn't
>> + translated above, we don't know anything about its encoding.
>> + In fact, it might be binary data... */
>> + printf(" %s\n", pname_stdout);
>
> The comment that said "We leave these printfs for now ..." referred to
> the fact that we aren't attempting to translate the character encoding
> to the native one, and that comment should apply to both branches of
> this new if/else.
>
> The comment applicable to this branch is "If OUT is null we explicitly
> use the native printf family, not an svn_stream."
>
>> + }
>> +
>> if (!names_only)
>> {
>> /* Add an extra newline to the value before indenting, so that
>> * every line of output has the indentation whether the value
>> * already ended in a newline or not. */
>> const char *newval = apr_psprintf(pool, "%s\n", propval->data);
>> -
>> - printf("%s", svn_cl__indent_string(newval, " ", pool));
>> + const char *indented_newval = svn_cl__indent_string(newval,
>> + " ",
>> + pool);
>> + if (out)
>> + {
>> + len = strlen(indented_newval);
>> + SVN_ERR(svn_stream_write(out, indented_newval, &len));
>
> Bug? You lost the conversion to native newlines that 'printf' provided,
> and haven't replaced it.
>
>
>> + }
>> + else
>> + {
>> + printf("%s", indented_newval);
>> + }
>> }
>> }
> [...]
>
>
> - Julian
>
>
>
Received on 2010-10-07 18:11:40 CEST