Re: [PATCH] Include offending XML in "Malformed XML" error message

From: Peter N. Lundblad <peter_at_famlundblad.se>
Date: 2005-03-01 09:00:34 CET

On Mon, 28 Feb 2005, Charles Bailey wrote:

> On Mon, 28 Feb 2005 22:39:13 +0100 (CET), Peter N. Lundblad
> <peter@famlundblad.se> wrote:
> > On Mon, 28 Feb 2005, Charles Bailey wrote:
> >
> > You misunderstood me. I didn't mean we need to escape in general (we
> > already do in the cmdline output routines to be safe). User input is
> > converted from the native encoding to UTF8 rather early, so normally we
> > rely on strings being valid UTF8. This is a special case, since, if I
> > understand correctly, it is raw XML from the parser. We rely on the parser
> > doing the recoding to UTF8 for us, but since this is an error situation,
> > the data might not be valid UTF8. That's why we need this ugly escaping in
> > this case (and when reporting recoding errors in utf.c).
>
> Fair enough. This is where my brief experience with svn limits me.
> Depending on how "UTF-8-safe" svn needs to be, my point may still
> apply to any data read from a file, however. I think that would
> include not only XML from admin files, but property names and values,
> and any fragments from base or working revisions. This protects
> against direct edits to the files, as well as errors elsewhere in svn
> (as was the case with the offending XML here).
>
Yes, you have a good point in that we don't always make sure our input is
valid UTF8 (especially from the network). This probably needs to be fixed
someday. But it should be done as input validation. Internally, we must be
able to assume that strings are valid UTF8.
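
A minimal sketch of what validation at the input boundary could look like,
written in Python for brevity rather than in Subversion's C. The function
name, the `source` label, and the error text are hypothetical and do not
correspond to any existing svn API:

```python
def validate_utf8(data: bytes, source: str) -> str:
    """Decode bytes from an untrusted source, rejecting invalid UTF-8.

    Hypothetical helper: everything that passes through here can be
    treated as valid UTF-8 by the rest of the program.
    """
    try:
        # Strict decoding fails on stray continuation bytes, truncated
        # sequences, overlong encodings, and encoded surrogates.
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"{source}: invalid UTF-8 at byte offset {exc.start}"
        ) from exc
```

The idea is that the check happens once, where data enters (file, network,
property value), so that internal code never has to re-check its strings.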

> Mind you, I'm not advocating this -- I think it's a lot of work to
> guarantee that a non-UTF-8 character is never presented to an external
> library or to the user. I'll work on the XML parse error as a special
> case, and leave the broader policy decisions for a later time.

Good. Note that this really is input validation. That's why we need to be
careful.
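
For the special case of quoting raw, possibly invalid XML in an error
message, the escaping could be sketched as follows (Python again, for
illustration only). The function name and the `\xNN` escape format are
assumptions, not what the patch or utf.c actually does:

```python
def escape_for_message(raw: bytes) -> str:
    """Make raw parser input safe to embed in an error message.

    Hypothetical helper: valid UTF-8 sequences are kept as-is, and each
    byte that is not part of a valid sequence becomes a literal \\xNN
    escape, so the result is always a well-formed string.
    """
    return raw.decode("utf-8", errors="backslashreplace")


# A Latin-1 byte (0xE9) inside otherwise ASCII XML is escaped, not dropped.
message = "Malformed XML: " + escape_for_message(b"<entry name=\xe9>")
```

Unlike the validation case, nothing is rejected here: the offending bytes
are exactly what the user needs to see, so they are made printable instead.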

Regards,
//Peter

Received on Tue Mar 1 08:59:28 2005

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.