Julian Foad wrote on 01/26/2005 08:37 AM:
> VK Sameer wrote:
>
> And here's my rather picky review. :-)
No problem. :)
> > ------------------------------------------------------------------------
> >
> > Fixed issue #2147 - control characters in log message cause failure
> (over DAV)
> >
> > * subversion/mod_dav_svn/log.c: modified log_receiver to call
> > svn_xml_fuzzy_escape()
>
> But what is the intention or result of this change? In other words, in
> what way does calling that new function affect what this function does?
> For example:
>
> * subversion/mod_dav_svn/log.c (log_receiver): call svn_xml_fuzzy_escape
> to ensure no control characters remain in the log message.
Fixed
> You should name in parentheses each top-level program object (function,
> variable, etc.) that is modified/added/removed:
OK.
>> Index: subversion/include/svn_xml.h
>> +/** Return @c string on finding no ASCII control characters. Otherwise
>> + * return a copy created with @c pool, with control characters
>
>
> "@a string" and "@a pool" - they are Arguments rather than
> Cross-references (that's how I think of the "@a" and "@c" mnemonics,
> even if that's not the formal definition).
My bad - fixed.
>> + * replaced by "0x\d+" sequences.
>
>
> Presumably hex digits, not "\d".
Yup, confused me in the .c file as well. Fixed.
>> + */
>> +const char* svn_xml_fuzzy_escape (const char *string,
>
>
> Put space before "*", not after it.
OK.
> Why does this return "const char *" when all the other
> svn_xml_escape_... functions return void and append their output to a
> stringbuf? (Not saying you necessarily should change it.)
So that it can be chained in the caller. The intent was to match the
existing call to apr_cml_quote_string.
>> /* Generalized Subversion XML Parsing */
>> Index: subversion/libsvn_subr/xml.c
>> ===================================================================
>> --- subversion/libsvn_subr/xml.c (revision 12853)
>> +++ subversion/libsvn_subr/xml.c (working copy)
>> @@ -25,6 +25,7 @@
>> #include "svn_pools.h"
>> #include "svn_xml.h"
>> #include "svn_error.h"
>> +#include "svn_ctype.h"
>>
>> #ifdef SVN_HAVE_OLD_EXPAT
>> #include "xmlparse.h"
>> @@ -265,7 +266,47 @@
>> xml_escape_attr (outstr, string, (apr_size_t) strlen (string), pool);
>> }
>>
>> +/* Escape any ASCII control chars. */
>
>
> Comment not needed at implementation, because the description is given
> at the prototype.
OK.
>> +const char*
>
>
> Space before "*".
OK.
>> +svn_xml_fuzzy_escape (const char *string, apr_pool_t *pool)
>> +{
>> + const char *end = string + strlen (string);
>> + const char *p = string, *q;
>> + svn_stringbuf_t *outstr;
>>
>> + for (q = p; q < end; q++)
>
>
> "for (q = string; ...)" would be clearer.
Hmm, I'm repeating the assignment later, for different reasons, though.
This seems more consistent.
>> + if (svn_ctype_iscntrl(*q))
>
> Space after function name.
OK.
>> + while (q < end && !svn_ctype_iscntrl(*q))
>
> Space.
OK.
>> + /* Append the control character in 0x%x format. */
>
>
> (Format is described differently in the doc string.)
Fixed.
>> + char escaped_char[7];
>
>
> You can't put declarations after code in C. (Obviously your compiler
> let you; perhaps you can increase its warnings level.) Move it to the
> top of the block.
Done.
> Seven characters: '0', 'x', xdigit, [xdigit,] '\0', and what? (It's not
> that having a couple of extra bytes is a waste of memory, it's that it
> confuses the reader: I'm now looking around trying to find what might
> use the extra bytes.)
I'm guessing the \d got me thinking there might be 6 bytes. Fixed.
>> +++ subversion/mod_dav_svn/log.c (working copy)
>> @@ -98,9 +98,10 @@
>> if (msg)
>> SVN_ERR( dav_svn__send_xml(lrb->bb, lrb->output,
>> "<D:comment>%s</D:comment>" DEBUG_CR,
>> - apr_xml_quote_string(pool, msg, 0)) );
>> + svn_xml_fuzzy_escape (
>> + apr_xml_quote_string (pool, msg, 0),
>> + pool)) );
>
>
> No. You should strip control characters before you XML-quote it,
> otherwise you are relying on (1) your function producing validly quoted
> XML (which it may do at present but is not documented to do), and
I don't think I understood your point.
Does apr_xml_quote_string require a validly-quoted XML string?
Also, svn_xml_fuzzy_escape() does not do XML in the sense of quoting,
unlike say svn_xml_escape_cdata().
> (2) the XML-quoting function accepting and passing through control
> characters (which such a function need not be expected to do).
It does do it now, which is why there is a bug report filed :)
Both the svn and apr functions seem to pass through bytes they are not
looking for.
Changing the order is not a big deal. It seems logical, though, to call
the fuzzy_escaper as the last thing before sending the XML data.
>> apr_hash_index_t *hi;
>> Index: subversion/tests/clients/cmdline/log_tests.py
>> ===================================================================
>> --- subversion/tests/clients/cmdline/log_tests.py (revision 12853)
>> +++ subversion/tests/clients/cmdline/log_tests.py (working copy)
>> @@ -516,6 +516,70 @@
>> svntest.actions.run_and_verify_svn (None, [], SVNAnyOutput,
>> 'log', '-r', '2', mu2_URL)
>>
>> +#----------------------------------------------------------------------
>> +def escape_control_chars(sbox):
>> + "make sure mod_dav_svn escapes control chars"
>> +
>> + dump_str = """SVN-fs-dump-format-version: 2
...
>> +svn:log
>> +V 37
>> +this msg contains a Ctrl-T: <- here
>
>
> Eww. Is this likely to be preserved in people's editors? Perhaps you
> could find a programmatic way of inserting it into the string without it
> having to be a literal control character in the source file.
OK, I did a quick search and didn't understand the Python code that
showed up. Could somebody with Python experience help? Perl interpolates
the value of a variable into a here document. How do I do that in Python?
Thanks for the detailed review.
Sameer
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Fri Jan 28 11:43:17 2005