
Re: [PATCH] Include offending XML in "Malformed XML" error message

From: Peter N. Lundblad <peter_at_famlundblad.se>
Date: 2005-02-27 22:25:11 CET

On Sat, 26 Feb 2005, Charles Bailey wrote:

> On Thu, 24 Feb 2005 14:17:20 +0100 (CET), Peter N. Lundblad
> <peter@famlundblad.se> wrote:
> > On Wed, 23 Feb 2005, Charles Bailey wrote:
> >
> > Good start:-) A few comments below. (Also, could you please try to get the
> > MIME type of your attachments as some text MIME type? Makes it easier to
> > review.)
> Definitely. I put a little too much faith in the mail client's
> ability to recognize text; for now, I'll just inline the patch, since
> it's small.
That's fine.

> A revised patch is appended below my sig, with appropriate attention
> to l10n. I've taken idiom from other messages where I can (z.B.
> 'beginnen' v. 'anfingen'). More fluent readers may want to pitch in
> on the middle European and Scandinavian languages, where my morphology
> may be laughably in error, and I've punted almost entirely on the east
> Asian languages, but I hope it'll serve as a decent start nonetheless.
You needn't bother with the .po files in a code patch. The translators
will take care of that. (We generally don't expect patch submitters to
provide translations of all 10 languages:-)

> Index: subversion/libsvn_subr/xml.c
> ===================================================================
> --- subversion/libsvn_subr/xml.c (revision 13160)
> +++ subversion/libsvn_subr/xml.c (working copy)
> @@ -417,9 +417,9 @@
> {
> err = svn_error_createf
> - _("Malformed XML: %s at line %d"),
> + _("Malformed XML: %s at line %d; XML starts:\n%.240s"),
> XML_ErrorString (XML_GetErrorCode (svn_parser->parser)),
> - XML_GetCurrentLineNumber (svn_parser->parser));
> + XML_GetCurrentLineNumber (svn_parser->parser), buf);
I discussed this with Erik Heulsmann on IRC and our conclusion is that we
need to be more careful with encodings here. As I stated earlier, we have
no guarantee that this string will be valid UTF8 (and we can't know from
the error either, since the encoding error might be after the error being
reported). We have code in utf_valid.c to decide if a string is valid
UTF8. You need to use this. We have code in utf.c (fuzzy_escape) to fuzzily
escape non-ASCII characters in a string. Export this function as an
internal "API" (svn_utf__fuzzy_escape) and use it if the part of the
string is invalid UTF8. To make it even more complicated, you should
ensure that you cut the string on a UTF8 boundary. There is code for this
in svn_ctype.h. Make sure you break the string before a LEAD character or
ASCII character. This will avoid unnecessary escaping.
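
To illustrate the two steps above (cutting on a character boundary, and
escaping when the excerpt is not valid UTF8), here is a minimal standalone
sketch. The names utf8_cut_length and escape_non_ascii are hypothetical and
are not Subversion functions; the real patch would use the helpers in
svn_ctype.h and the exported fuzzy_escape instead, and the "?\ddd" escape
format shown here is only an assumption for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if C is a UTF-8 continuation byte
   (bit pattern 10xxxxxx). */
static int
is_utf8_cont (unsigned char c)
{
  return (c & 0xC0) == 0x80;
}

/* Hypothetical helper: return the largest length N <= MAX such that
   cutting BUF (LEN bytes) to N bytes does not split a multi-byte
   sequence, i.e. the byte at offset N is an ASCII or lead byte,
   or N == LEN. */
static size_t
utf8_cut_length (const char *buf, size_t len, size_t max)
{
  size_t n;

  assert (buf != NULL);
  if (len <= max)
    return len;

  /* Back up while the first byte we would drop is a continuation byte. */
  n = max;
  while (n > 0 && is_utf8_cont ((unsigned char) buf[n]))
    --n;
  return n;
}

/* Hypothetical stand-in for a fuzzy escape: copy SRC (LEN bytes) to DST,
   replacing every non-ASCII byte with "?\ddd" (decimal byte value).
   DST must hold at least 5 * LEN + 1 bytes.  Returns the number of
   bytes written, not counting the terminating NUL. */
static size_t
escape_non_ascii (char *dst, const char *src, size_t len)
{
  size_t i, out = 0;

  for (i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) src[i];
      if (c < 0x80)
        dst[out++] = (char) c;
      else
        out += (size_t) sprintf (dst + out, "?\\%03u", (unsigned int) c);
    }
  dst[out] = '\0';
  return out;
}
```

A caller would first compute the cut length for the 240-byte excerpt, then
check the excerpt for validity and pass it through the escape only if the
check fails, so that well-formed input is shown unmodified.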

As so often, a patch turns out to be a little more work than anticipated
at first glance... But, please continue. It will help us much in the


Received on Sun Feb 27 22:24:10 2005

This is an archived mail posted to the Subversion Dev mailing list.
