Hi all!
This problem report partly duplicates issues
#2293 and #2866.
On a Linux-2.4.36 box I use svn version 1.5.0
(r31699) which was compiled with gcc-4.3.1.
The XML output of the command
svn log --xml svn://hal.cs.berkeley.edu/home/svn/projects/trunk/cil
is not well-formed XML because bytes above 0x7e
in the original text (DEL and every byte with its
high bit set) do not get quoted.
The following patch fixes this particular
problem of incomplete quotation. It is meant as
a proof-of-concept and surely won't win any
beauty contest.
--- subversion/libsvn_subr/xml.c.orig 2008-07-06 09:38:16.000000000 +0200
+++ subversion/libsvn_subr/xml.c 2008-07-06 09:15:23.000000000 +0200
@@ -119,7 +119,9 @@
          golly, if we say we want to escape a '\r', we want to make
          sure it remains a '\r'! */
       q = p;
-      while (q < end && *q != '&' && *q != '<' && *q != '>' && *q != '\r')
+      while (q < end
+             && *q != '&' && *q != '<' && *q != '>' && *q != '\r'
+             && (unsigned) *q <= 0x7e)
         q++;
       svn_stringbuf_appendbytes(*outstr, p, q - p);
 
@@ -136,6 +138,12 @@
         svn_stringbuf_appendcstr(*outstr, "&gt;");
       else if (*q == '\r')
         svn_stringbuf_appendcstr(*outstr, "&#13;");
+      else if ((unsigned) *q > 0x7e)
+        {
+          char buffer[8];
+          sprintf(buffer, "&#%u;", (unsigned) *q & 0xff);
+          svn_stringbuf_appendcstr(*outstr, buffer);
+        }
 
       p = q + 1;
     }
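For anyone who wants to try the idea outside a Subversion tree, here
is a minimal standalone sketch of the escaping that the patched loop
performs. The function name escape_cdata_sketch, its fixed output
buffer and its return convention are assumptions made for
illustration; none of them come from xml.c. Keep in mind that a
reference such as &#233; names a Unicode code point, so escaping byte
by byte only yields the intended character when the log text is
ISO-8859-1. A multi-byte UTF-8 sequence would come out as several
unrelated characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of Subversion.  Escapes LEN bytes of DATA
   into OUT, which has room for CAP bytes including the terminating NUL.
   Returns the length of the result, or (size_t) -1 if OUT is too small. */
size_t
escape_cdata_sketch(const char *data, size_t len, char *out, size_t cap)
{
  size_t used = 0;
  size_t i;

  assert(data != NULL && out != NULL);
  if (cap == 0)
    return (size_t) -1;

  for (i = 0; i < len; i++)
    {
      /* Read the byte as unsigned char so its value is 0..255 whether
         or not plain char is signed on this platform. */
      unsigned char c = (unsigned char) data[i];
      char ref[8];              /* "&#255;" plus NUL needs 7 bytes */
      const char *piece;
      size_t piece_len;

      if (c == '&')
        piece = "&amp;";
      else if (c == '<')
        piece = "&lt;";
      else if (c == '>')
        piece = "&gt;";
      else if (c == '\r')
        piece = "&#13;";
      else if (c > 0x7e)
        {
          /* Same threshold as the patch: DEL and all high-bit bytes. */
          snprintf(ref, sizeof(ref), "&#%u;", (unsigned) c);
          piece = ref;
        }
      else
        {
          /* Everything else is copied through unchanged. */
          if (used + 2 > cap)
            return (size_t) -1;
          out[used++] = (char) c;
          continue;
        }

      piece_len = strlen(piece);
      if (used + piece_len + 1 > cap)
        return (size_t) -1;
      memcpy(out + used, piece, piece_len);
      used += piece_len;
    }

  out[used] = '\0';
  return used;
}
```

Called on the five bytes 'a', '<', 0xe9, '\r', 0x7f, the sketch
produces a&lt;&#233;&#13;&#127; in the output buffer.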
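The casts to "unsigned" in the patch work around
"char" being signed on some systems: there a byte
above 0x7f reads as a negative value, so a plain
comparison against 0x7e would never match it.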
My patch leaves one loophole open: control
characters below "space", i.e. with a code < 0x20,
still pass through unquoted (apart from '\r').
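The remaining control characters are awkward because XML 1.0 only
admits tab (0x09), line feed (0x0A) and carriage return (0x0D) below
0x20. The others are not legal even when written as a character
reference, so quoting them the same way as the high-bit bytes would
not be enough. A rough sketch of how the bytes might be sorted into
classes follows. The enum and the function classify_byte are
hypothetical names for illustration, and what a caller should do with
a forbidden byte (drop it, replace it, or report an error) is left
open. The conversion through unsigned char sidesteps the signed
"char" problem as well.

```c
/* Hypothetical classification, for illustration only. */
enum byte_class
{
  BYTE_LITERAL,    /* may be copied through unchanged */
  BYTE_ESCAPE,     /* write as an entity or character reference */
  BYTE_FORBIDDEN   /* not a legal XML 1.0 character, even as &#N; */
};

enum byte_class
classify_byte(char ch)
{
  /* Convert through unsigned char first, so a high-bit byte is seen
     as 128..255 even where plain char is signed. */
  unsigned char c = (unsigned char) ch;

  if (c == '\t' || c == '\n')
    return BYTE_LITERAL;
  if (c == '\r')
    return BYTE_ESCAPE;         /* xml.c already quotes '\r' */
  if (c < 0x20)
    return BYTE_FORBIDDEN;      /* the loophole: no valid escape exists */
  if (c == '&' || c == '<' || c == '>')
    return BYTE_ESCAPE;
  if (c > 0x7e)
    return BYTE_ESCAPE;         /* the threshold used by the patch */
  return BYTE_LITERAL;
}
```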
/Chris
PS: I'm not subscribed to the mailing list, so
please e-mail me directly if there are any
questions.
Received on 2008-07-08 00:00:22 CEST