
Re: [PATCH] Include offending XML in "Malformed XML" error message

From: Charles Bailey <bailey.charles_at_gmail.com>
Date: 2005-02-28 18:11:27 CET

On Mon, 28 Feb 2005 08:11:33 +0100 (CET), Peter N. Lundblad
<peter@famlundblad.se> wrote:
> On Sun, 27 Feb 2005, Charles Bailey wrote:
>
> > On Sun, 27 Feb 2005 22:25:11 +0100 (CET), Peter N. Lundblad
> > <peter@famlundblad.se> wrote:
> [snip discussion about ensuring that the error string is valid UTF-8]
>
> > This looks like a SMOP, but I'm not sure what the real benefit is.
> > Does Subversion make the guarantee that all of its output will be
> > UTF-8? Unless that's a major goal, I think there's something to be
> > said for inserting the offending XML "as is" into the error message.
>
> Not its output, but the strings internally. With few exceptions our APIs
> work with UTF8 internally. The problem with adding "raw data" is that the
> error message will be invalid UTF8 and you will have recoding errors
> later. Our own error output routines will then escape the whole string,
> but other libraries might not do that. So, yes, keeping our strings valid
> UTF8 is a goal.

OK. I'll give it a go when I get a block of free time.

> > If it is a general policy to convert to UTF-8, should I code this as a
> > separate function, rather than putting the logic into parse_xml?
> >
> You can put it in a separate function. Keep it internal to the file,
> though, until we see another use case for it. We can export it if that
> happens.
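
For illustration, a file-local helper of the kind discussed above might look roughly like the sketch below. It is not taken from the patch or from Subversion itself: the names utf8_seq_len and escape_invalid_utf8, the "?\NNN" decimal escape format, and the caller-supplied buffer are all assumptions made for the example. Real Subversion code would more likely allocate from an APR pool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the length (1-4) of the well-formed UTF-8
   sequence starting at P, or 0 if the bytes there do not form one.
   REMAINING is the number of bytes available at P. */
static size_t
utf8_seq_len(const unsigned char *p, size_t remaining)
{
  size_t len, i;
  unsigned long cp;

  if (remaining == 0)
    return 0;
  if (p[0] < 0x80)
    return 1;                                   /* plain ASCII */
  else if ((p[0] & 0xE0) == 0xC0) { len = 2; cp = p[0] & 0x1F; }
  else if ((p[0] & 0xF0) == 0xE0) { len = 3; cp = p[0] & 0x0F; }
  else if ((p[0] & 0xF8) == 0xF0) { len = 4; cp = p[0] & 0x07; }
  else
    return 0;               /* stray continuation byte or invalid lead byte */

  if (len > remaining)
    return 0;                                   /* truncated sequence */
  for (i = 1; i < len; i++)
    {
      if ((p[i] & 0xC0) != 0x80)
        return 0;                               /* not a continuation byte */
      cp = (cp << 6) | (p[i] & 0x3F);
    }

  /* Reject overlong encodings, surrogates, and values above U+10FFFF. */
  if ((len == 2 && cp < 0x80) || (len == 3 && cp < 0x800)
      || (len == 4 && cp < 0x10000)
      || (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
    return 0;
  return len;
}

/* Hypothetical helper, static so it stays internal to the file.
   Copy SRC into DST, replacing every byte that is not part of a valid
   UTF-8 sequence with the five-character escape "?\NNN" (NNN = decimal
   byte value).  DST must hold at least 5 * strlen(SRC) + 1 bytes.
   Returns the length of the escaped string. */
static size_t
escape_invalid_utf8(char *dst, size_t dstsize, const char *src)
{
  const unsigned char *p = (const unsigned char *) src;
  size_t remaining, out = 0;

  assert(dst != NULL && src != NULL);
  remaining = strlen(src);
  assert(dstsize >= 5 * remaining + 1);         /* worst case: all escaped */

  while (remaining > 0)
    {
      size_t len = utf8_seq_len(p, remaining);

      if (len > 0)
        {
          memcpy(dst + out, p, len);            /* valid sequence: copy as is */
          out += len;
          p += len;
          remaining -= len;
        }
      else
        {
          sprintf(dst + out, "?\\%03u", (unsigned) *p);
          out += 5;                             /* invalid byte: escape it */
          p++;
          remaining--;
        }
    }
  dst[out] = '\0';
  return out;
}
```

With this sketch, the Latin-1 string "caf\xE9" comes out as "caf?\233", while the valid UTF-8 string "caf\xC3\xA9" passes through unchanged, so the result can always be embedded safely in a UTF-8 error message.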

As a first pass, the '%s' token appears in 410 error messages in HEAD.
On rapid inspection, about half appear to take internal strings,
which I would think are more likely to be valid UTF-8 (though the
error motivating my original patch was an internal string, so that
should be taken with the proverbial grain of salt). The others are
most often user input, so may or may not be valid UTF-8 depending on
the user's locale settings; I haven't traced code to see how often
they're already escaped. How much of this needs coverage, though,
looks to me like a question for the longer term and for more
experienced svn hands than I.

--
Regards,
Charles Bailey
Lists: bailey _dot_ charles _at_ gmail _dot_ com
Other: bailey _at_ newman _dot_ upenn _dot_ edu
Received on Mon Feb 28 18:12:41 2005

This is an archived mail posted to the Subversion Dev mailing list.