[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r1003238 - in /subversion/trunk/subversion: svn/cl.h svn/propget-cmd.c svn/proplist-cmd.c svn/props.c tests/cmdline/prop_tests.py

From: Paul Burba <ptburba_at_gmail.com>
Date: Thu, 7 Oct 2010 12:11:01 -0400

On Thu, Oct 7, 2010 at 8:09 AM, Julian Foad <julian.foad_at_wandisco.com> wrote:
> Hi Paul.
>
> It's good to see a fix.  I think, however, you could do this more simply
> and avoid some of the difficulties completely.

Julian,

Thanks as always for the review, comments inline.

> On Thu, 2010-09-30, pburba_at_apache.org wrote:
>> Author: pburba
>> Date: Thu Sep 30 20:21:19 2010
>> New Revision: 1003238
>>
>> URL: http://svn.apache.org/viewvc?rev=1003238&view=rev
>> Log:
>> Fix issue #3721 'redirection of svn propget output corrupted with large
>> property values'.
>
> And this change also fixes an EOL-style inconsistency, which was
> presumably happening consistently with property values of any size,
> doesn't it?

Yes.

> I think this change also introduces a new, similar, EOL-style
> inconsistency, in a different place; see last review comment at end of
> email.

You are right, there is a problem, though it might not be exactly what
you think:

If the property requires translation per svn_prop_needs_translation
(i.e. it's an svn:* prop) we will already have run the property value
through svn_subst_detranslate_string(), which gives us native line
endings. So in these cases the only non-native newline will be the
trailing newline after the propval:

          /* Add an extra newline to the value before indenting, so that
           * every line of output has the indentation whether the value
           * already ended in a newline or not. */
          const char *newval = apr_psprintf(pool, "%s\n", propval->data);

        ^^
If we return to the pre-r1003238 behavior of using printf, then
multiline svn:* prop values (e.g. svn:mergeinfo) are passed to printf
with native newlines and then printf tries (on my Windows box at any
rate) to convert those eols to native, resulting in redirected output
like this:

Properties on 'iota':<CR><LF>
  propname<CR><LF>
    subversion/branches/1.5.x:872364-874936<CR><CR><LF>
    subversion/branches/1.5.x-34184:874657-874741<CR><CR><LF>
    subversion/branches/1.5.x-34432:874744-874798<CR><CR><LF>
    .
    .

So avoiding printf's eol conversion by writing the the out stream
works better here.

> The difficulty with EOLs is that the native printf() family of functions
> typically converts to native EOLs, and the svn_stream_write() family
> doesn't, even when writing to svn_stream_for_stdout().  Therefore we
> have to be careful when mixing the two families.
>
>> This change makes all the writes to stdout, by svn pg, via the svn_stream_t *
>> that we already use to print the 'Properties on' header.  Note that this
>> stream *is* attached to stdout, but avoiding the mix of stream writes and
>> printfs solves issue #3721 on Windows.
>
> It looks like the root of the corruption is also the mixing of native
> 'printf' functions with writes to a 'svn_stream_for_stdout()' stream.
> Do you agree that that's the case, and that therefore I should write a
> warning about this in the documentation of 'svn_stream_for_stdout()'?

I agree that is what the symptoms point to. I couldn't definitely
pinpoint the exact cause however.

> Perhaps not now, but we should consider whether we can eliminate the
> corruption by doing something to svn_stream_for_stdout() and/or the
> native 'stdout' stream.  It might be possible by turning off some
> buffering that is happening, or something like that, or probably it
> could be done by implementing the stream_for_stdout as a stream class
> that forwards to the printf family of functions.
>
> What we probably should do now is avoid mixing printf with svn streams
> in this 'svn propget' code.  From what I can see, there is little or no
> reason why svn_cl__propget() was using an svn stream for some of its
> printing at all.  It looks like it would be much simpler to remove that
> stream (the variable called 'out') and just use printf-style functions
> throughout, as is done for 'proplist' and most of the rest of the UI
> (except for 'cat' and 'blame').  Note that we have svn_cmdline_printf()
> which is a handy encoding-converting variant of the native 'printf'
> family.
>
> Does that sound good?

Assuming there was never a reason to mix printfs with stream writes in
the first place, then yes, this would be a much simpler fix. As
mentioned above though, we will need to deal with printf's conversion
to native eols when the string it's printing already has them.

Paul

> If so, then all the comments below become moot.
> (I wrote some of them before I figured all this out, and they may still
> be relevant if I'm wrong about some of this.)
>
>
>> * subversion/svn/cl.h
>>
>>   (svn_cl__print_prop_hash): Support printing to a passed in svn_stream_t *.
>>
>> * subversion/svn/propget-cmd.c
>>
>>   (print_properties): Make the header's line endings native.  Print the
>>    property names and values to the same svn_stream_t * we already use
>>    print the headers to.
>>
>> * subversion/svn/proplist-cmd.c
>>
>>   (proplist_receiver, svn_cl__proplist): Update calls to
>>    svn_cl__print_prop_hash(), keeping old behavior.
>>
>> * subversion/svn/props.c
>>
>>   (svn_cl__print_prop_hash): Support printing to a passed in svn_stream_t *,
>>    otherwise fall-back to old behavior of using printf().
>>
>>
>> * subversion/tests/cmdline/prop_tests.py
>>
>>   (propget_redirection): Remove comment re failure status.
>>
>>   (test_list): Remove XFail.
>
>
>> Modified: subversion/trunk/subversion/svn/cl.h
>> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/cl.h?rev=1003238&r1=1003237&r2=1003238&view=diff
>> ==============================================================================
>> --- subversion/trunk/subversion/svn/cl.h (original)
>> +++ subversion/trunk/subversion/svn/cl.h Thu Sep 30 20:21:19 2010
>> @@ -417,15 +417,18 @@ svn_cl__print_status_xml(const char *pat
>>                           apr_pool_t *pool);
>>
>>
>> -/* Print a hash that maps property names (char *) to property values
>> -   (svn_string_t *).  The names are assumed to be in UTF-8 format;
>> +/* Print to stdout a hash that maps property names (char *) to property
>> +   values (svn_string_t *).  The names are assumed to be in UTF-8 format;
>>     the values are either in UTF-8 (the special Subversion props) or
>>     plain binary values.
>>
>> +   If OUT is not NULL, then write to it rather than stdout.
>
> This could do with pointing out that the two modes should not be mixed.
>
> This seems like unnecessary complexity, allowing the caller to choose
> whether to write to the native stdout stream or write to a supplied
> stream which in practice (at present) is always connected to stdout.
> Just support one or the other and make both callers consistent.
>
>> +
>>     If NAMES_ONLY is true, print just names, else print names and
>>     values. */
>>  svn_error_t *
>> -svn_cl__print_prop_hash(apr_hash_t *prop_hash,
>> +svn_cl__print_prop_hash(svn_stream_t *out,
>> +                        apr_hash_t *prop_hash,
>>                          svn_boolean_t names_only,
>>                          apr_pool_t *pool);
>>
>>
>> Modified: subversion/trunk/subversion/svn/propget-cmd.c
>> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/propget-cmd.c?rev=1003238&r1=1003237&r2=1003238&view=diff
>> ==============================================================================
>> --- subversion/trunk/subversion/svn/propget-cmd.c (original)
>> +++ subversion/trunk/subversion/svn/propget-cmd.c Thu Sep 30 20:21:19 2010
>> @@ -140,6 +140,12 @@ print_properties(svn_stream_t *out,
>>                                  ? _("Properties on '%s':\n")
>>                                  : "%s - ", filename);
>>            SVN_ERR(svn_cmdline_cstring_from_utf8(&header, header, iterpool));
>> +          SVN_ERR(svn_subst_translate_cstring2(header, &header,
>> +                                               APR_EOL_STR,  /* 'native' eol */
>> +                                               FALSE, /* no repair */
>> +                                               NULL,  /* no keywords */
>> +                                               FALSE, /* no expansion */
>> +                                               iterpool));
>
> Strictly speaking, the newline conversion should come before the
> translation from UTF8 to native encoding, in case a newline sequence is
> encoded differently in the native encoding.  (I guess we probably don't
> support such encodings.  If we do, there are other places where we do it
> wrongly.)
>
> That's a lot of source code just to print a string in native style.
> Four statements in total.  Can't we write and use a dedicated function
> or a dedicated stream class that does this?
>
> Or write
>
>  ? _("Properties on '%s':" APR_EOL_STR)
>
> and omit the translate_cstring.
>
>
>>            SVN_ERR(stream_write(out, header, strlen(header)));
>>          }
>>
>> @@ -149,7 +155,7 @@ print_properties(svn_stream_t *out,
>>            apr_hash_t *hash = apr_hash_make(iterpool);
>>
>>            apr_hash_set(hash, pname_utf8, APR_HASH_KEY_STRING, propval);
>> -          svn_cl__print_prop_hash(hash, FALSE, iterpool);
>> +          svn_cl__print_prop_hash(out, hash, FALSE, iterpool);
>>          }
>>        else
>>          {
>>
>> Modified: subversion/trunk/subversion/svn/proplist-cmd.c
>> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/proplist-cmd.c?rev=1003238&r1=1003237&r2=1003238&view=diff
>> ==============================================================================
>> --- subversion/trunk/subversion/svn/proplist-cmd.c (original)
>> +++ subversion/trunk/subversion/svn/proplist-cmd.c Thu Sep 30 20:21:19 2010
>> @@ -98,7 +98,8 @@ proplist_receiver(void *baton,
>>
>>    if (!opt_state->quiet)
>>      SVN_ERR(svn_cmdline_printf(pool, _("Properties on '%s':\n"), name_local));
>> -  return svn_cl__print_prop_hash(prop_hash, (! opt_state->verbose), pool);
>> +  return svn_cl__print_prop_hash(NULL, prop_hash, (! opt_state->verbose),
>> +                                 pool);
>>  }
>>
>>
>> @@ -159,7 +160,7 @@ svn_cl__proplist(apr_getopt_t *os,
>>                                  rev));
>>
>>            SVN_ERR(svn_cl__print_prop_hash
>> -                  (proplist, (! opt_state->verbose), scratch_pool));
>> +                  (NULL, proplist, (! opt_state->verbose), scratch_pool));
>>          }
>>      }
>>    else  /* operate on normal, versioned properties (not revprops) */
>>
>> Modified: subversion/trunk/subversion/svn/props.c
>> URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/svn/props.c?rev=1003238&r1=1003237&r2=1003238&view=diff
>> ==============================================================================
>> --- subversion/trunk/subversion/svn/props.c (original)
>> +++ subversion/trunk/subversion/svn/props.c Thu Sep 30 20:21:19 2010
>> @@ -82,7 +82,8 @@ svn_cl__revprop_prepare(const svn_opt_re
>>
>>
>>  svn_error_t *
>> -svn_cl__print_prop_hash(apr_hash_t *prop_hash,
>> +svn_cl__print_prop_hash(svn_stream_t *out,
>> +                        apr_hash_t *prop_hash,
>>                          svn_boolean_t names_only,
>>                          apr_pool_t *pool)
>>  {
>> @@ -93,6 +94,7 @@ svn_cl__print_prop_hash(apr_hash_t *prop
>>        const char *pname = svn__apr_hash_index_key(hi);
>>        svn_string_t *propval = svn__apr_hash_index_val(hi);
>>        const char *pname_stdout;
>> +      apr_size_t len;
>>
>>        if (svn_prop_needs_translation(pname))
>>          SVN_ERR(svn_subst_detranslate_string(&propval, propval,
>> @@ -100,18 +102,45 @@ svn_cl__print_prop_hash(apr_hash_t *prop
>>
>>        SVN_ERR(svn_cmdline_cstring_from_utf8(&pname_stdout, pname, pool));
>>
>> -      /* ### We leave these printfs for now, since if propval wasn't translated
>> -       * above, we don't know anything about its encoding.  In fact, it
>> -       * might be binary data... */
>> -      printf("  %s\n", pname_stdout);
>
> So this 'printf' used to translate to native EOLs...
>
>> +      if (out)
>> +        {
>> +          pname_stdout = apr_psprintf(pool, "  %s\n", pname_stdout);
>> +          SVN_ERR(svn_subst_translate_cstring2(pname_stdout, &pname_stdout,
>> +                                              APR_EOL_STR,  /* 'native' eol */
>> +                                              FALSE, /* no repair */
>> +                                              NULL,  /* no keywords */
>> +                                              FALSE, /* no expansion */
>> +                                              pool));
>> +
>> +          len = strlen(pname_stdout);
>> +          SVN_ERR(svn_stream_write(out, pname_stdout, &len));
>
> ... whereas svn_stream_write() doesn't, so you had to insert a manual
> translation in order to preserve the functionality while switching to
> stream functions.  (Unlike in propget-cmd.c above, here we don't convert
> from UTF8 because we don't know whether it is UTF8.)
>
> Same comments as above: maybe use SVN_EOL_STR or a dedicated printing
> function, instead of the translation function.  (Neither a file name nor
> a prop name is allowed to contain a newline.)
>
>> +        }
>> +      else
>> +        {
>> +          /* ### We leave these printfs for now, since if propval wasn't
>> +             translated above, we don't know anything about its encoding.
>> +             In fact, it might be binary data... */
>> +          printf("  %s\n", pname_stdout);
>
> The comment that said "We leave these printfs for now ..." referred to
> the fact that we aren't attempting to translate the character encoding
> to the native one, and that comment should apply to both branches of
> this new if/else.
>
> The comment applicable to this branch is "If OUT is null we explicitly
> use the native printf family, not an svn_stream."
>
>> +        }
>> +
>>        if (!names_only)
>>          {
>>            /* Add an extra newline to the value before indenting, so that
>>             * every line of output has the indentation whether the value
>>             * already ended in a newline or not. */
>>            const char *newval = apr_psprintf(pool, "%s\n", propval->data);
>> -
>> -          printf("%s", svn_cl__indent_string(newval, "    ", pool));
>> +          const char *indented_newval = svn_cl__indent_string(newval,
>> +                                                              "    ",
>> +                                                              pool);
>> +          if (out)
>> +            {
>> +              len = strlen(indented_newval);
>> +              SVN_ERR(svn_stream_write(out, indented_newval, &len));
>
> Bug?  You lost the conversion to native newlines that 'printf' provided,
> and haven't replaced it.
>
>
>> +            }
>> +          else
>> +            {
>> +              printf("%s", indented_newval);
>> +            }
>>          }
>>      }
> [...]
>
>
> - Julian
>
>
>
Received on 2010-10-07 18:11:40 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.