
Re: Control characters in log message cause failure

From: Branko Čibej <brane_at_xbc.nu>
Date: 2004-12-01 05:10:36 CET

Ben Reser wrote:

>On Mon, Nov 29, 2004 at 11:25:03PM +0000, Philip Martin wrote:
>
>
>>One of us is confused, or perhaps it is just terminology.
>>
>>There is no "mixed UTF8/non-UTF8 string" and there are no "non-UTF8"
>>characters that need to be "converted". There may be ASCII control
>>codes in the log message, and if these are not valid XML then they
>>need to be rejected or escaped, but the only place that UTF-8 comes in
>>is that ASCII control codes are encoded unchanged in UTF-8.
>>
>>It looks like we have the same problem with paths in the entries file:
>>
>>$ svn mkdir wc/`printf "\x18"`
>>$ svn st wc
>>../svn/subversion/libsvn_wc/entries.c:671: (apr_err=130003)
>>svn: XML parser failed in 'wc'
>>../svn/subversion/libsvn_subr/xml.c:365: (apr_err=130003)
>>svn: Malformed XML: not well-formed (invalid token) at line 13
>>
>>
>
>I was under the impression that Unicode disallowed control characters
>with the exception of tab, carriage return and line feed. The XML
>specification certainly gives me that impression:
>http://www.w3.org/TR/2004/REC-xml-20040204/#NT-Char
>
>Unfortunately, I don't have access to the Unicode standard to be sure.
>
>

Unicode has nothing to do with it, and it certainly encodes all ASCII
control characters (after all, Unicode code points U+0000 to U+007F _are_
ASCII). IMHO whoever wrote that part of the XML spec seems to have
forgotten that ASCII control characters are valid Unicode characters.
However, the fact remains that there is no way to represent control
chars (except \t, \r and \n) in XML 1.0 without using some custom
escaping mechanism.
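
To make that concrete, here is a minimal sketch in Python 3 (not
Subversion code) of how one might detect characters that fall outside
the XML 1.0 "Char" production and replace them with a custom escape.
The function names and the "?\NNN" decimal escape format are
assumptions chosen purely for illustration.

```python
import re

# XML 1.0 production [2]:
#   Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
#            | [#x10000-#x10FFFF]
# The negated class below matches every character NOT allowed by it.
_XML10_INVALID = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (index, code point) for each character XML 1.0 cannot carry."""
    return [(m.start(), ord(m.group())) for m in _XML10_INVALID.finditer(text)]


def escape_for_xml10(text):
    """Hypothetical custom escaping: invalid char -> '?\\NNN' (decimal)."""
    return _XML10_INVALID.sub(lambda m: "?\\%03d" % ord(m.group()), text)
```

For the path from the example above, find_invalid_xml_chars("wc/\x18")
reports the character at index 3, while a string containing only \t, \r
and \n passes untouched. Note that such an escape is lossy unless the
reader knows to undo it, which is why it has to be a private convention
rather than anything XML 1.0 itself provides.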

(BTW, note that \x7f is also a control character, according to ASCII and
Unicode)
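
A quick way to check both claims, sketched in Python 3 with only the
standard library: Unicode classifies the ASCII control codes (including
\x7f) as general category "Cc", UTF-8 encodes them as the same single
byte, and an expat-based parser still rejects something like \x18
inside a document. The helper names and the <log> wrapper element are
made up for this illustration.

```python
import unicodedata
import xml.parsers.expat


def is_control(ch):
    """True if Unicode assigns ch the general category Cc (control)."""
    return unicodedata.category(ch) == "Cc"


def utf8_is_unchanged(ch):
    """True if the UTF-8 encoding of ch is the single byte with the same value."""
    return ch.encode("utf-8") == bytes([ord(ch)])


def expat_accepts(ch):
    """Try to parse a tiny document containing ch; False if expat rejects it."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(("<log>%s</log>" % ch).encode("utf-8"), True)
        return True
    except xml.parsers.expat.ExpatError:
        return False
```

With these, is_control("\x18") and is_control("\x7f") are both true,
utf8_is_unchanged() holds for either, and expat_accepts("\x18") is
false. As an aside, U+007F lies inside the #x20-#xD7FF range of the
XML 1.0 Char production, so it is a control character that XML 1.0
nevertheless permits.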

-- Brane

Received on Wed Dec 1 05:11:02 2004

This is an archived mail posted to the Subversion Dev mailing list.
