
Re: "Malformed XML" (l10n?) problems on checkout

From: Charles Bailey <bailey_at_newman.upenn.edu>
Date: 2005-02-16 23:40:55 CET

--On February 16, 2005 3:22:33 PM -0600 kfogel@collab.net wrote:

> "Dale Worley" <dworley@pingtel.com> writes:
>>
>> But you've identified the Subversion problem correctly -- a file name can
>> contain "non-printable" characters, which are forbidden in XML. Worse,
>> what you might think is a valid escape sequence to represent it -- &#8;
>> -- is also forbidden in XML, because "character entities" are forbidden
>> from representing non-printable characters. See the discussion in
>>
>> http://www.w3c.org/TR/2004/REC-xml-20040204/#dt-charref
>>
>> Subversion may need to extend XML to allow this (and has to verify that
>> its XML parser can deal with it).
>
> Thanks for the analysis and the reference, Dale. You're right, and
> Subversion's way of dealing with this is that (now) it no longer
> permits such paths in the repository. See issue #1954, and see r12581
> and r12632. There is a long thread on the topic, linked to from the
> issue I believe, that explains the reasoning behind the decision.

Thanks; that's quite helpful. I'm sorry to have missed it in the archive.
I was too focused on the specific error, and should have searched for any
XML-related issue.
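
To illustrate the restriction Dale describes, here is a minimal Python sketch. It is an illustration only, not Subversion's own check: the helper names xml_forbidden_chars and parses_as_xml are hypothetical, and the character class is simply the complement of the "Char" production in the XML 1.0 recommendation. It finds the characters in a path that XML 1.0 forbids, and uses Python's bundled expat parser to show that neither the raw character nor a numeric character reference such as &#8; is accepted.

```python
import re
from xml.parsers import expat

# Complement of the XML 1.0 "Char" production: everything outside
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF].
_XML_FORBIDDEN = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def xml_forbidden_chars(path):
    """Return (index, code point) for each character of path that XML 1.0 forbids."""
    return [(m.start(), ord(m.group())) for m in _XML_FORBIDDEN.finditer(path)]


def parses_as_xml(document):
    """Return True if expat accepts document as well-formed XML."""
    try:
        expat.ParserCreate().Parse(document, True)
        return True
    except expat.ExpatError:
        return False


if __name__ == "__main__":
    name = "foo\x08bar"                    # file name containing a backspace
    print(xml_forbidden_chars(name))       # position and code point of each offender
    print(parses_as_xml("<p>%s</p>" % name))   # raw control character
    print(parses_as_xml("<p>foo&#8;bar</p>"))  # same character as a reference
```

A check of this kind, run over file names before an import, would flag vendor-supplied paths that a later checkout could not represent in XML.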

> So Charles, your solution right now is to rename that file in the
> repository (perhaps even via dump/transform/load, so it's fixed in

No problem. It's a test repository; we can just drop and recreate it.
It's helped to give me some idea of how svn might handle oddball file names
imported from outside vendors' packages, but there's no history in the
repository that's critical.

> history, if that's feasible for your team). Future versions of
> Subversion won't allow that path to be in the repository in the first
> place.

That's a big help.

Two quick questions:
- Is it worth a small patch to xml.c to include the contents of the
offending buffer in a "Malformed XML" error message? It could yield a
long/multiline message, but would help identify the offending text in a
large operation.
- Is it worth adding a bit of text about path name requirements to the
docs? Would it go in the Book (e.g. the UTF/Path requirements bit of the
developer info, or even a brief blurb about character sets in Ch 1), or
elsewhere?
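
As a rough sketch of the first suggestion, the following Python shows one way an error message could carry the offending buffer without becoming long or multiline. It is not the code in xml.c: the function names excerpt and malformed_xml_message, the 60-byte limit, and the message wording are all assumptions made for illustration. Non-printable bytes (including newlines) and backslashes are rendered as \xNN escapes, and the excerpt is truncated to a fixed number of bytes.

```python
def excerpt(buf, limit=60):
    """Render at most `limit` bytes of buf on one line.

    Printable ASCII is shown as-is; every other byte, and the backslash
    itself, is shown as a \\xNN escape so the output is unambiguous.
    """
    shown = buf[:limit]
    text = "".join(
        chr(b) if 0x20 <= b < 0x7F and b != 0x5C else "\\x%02x" % b
        for b in shown
    )
    if len(buf) > limit:
        text += "... (%d more bytes)" % (len(buf) - limit)
    return text


def malformed_xml_message(buf, reason):
    """Build a single-line error message that includes the offending data."""
    return "Malformed XML: %s; offending data: '%s'" % (reason, excerpt(buf))


if __name__ == "__main__":
    bad = b"<D:href>/trunk/foo\x08bar\n</D:href>"
    print(malformed_xml_message(bad, "invalid token"))
```

Escaping and truncating keeps the message to one bounded line while still pointing at the offending text in a large operation.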

--
Regards,
Charles Bailey  < bailey _at_ newman _dot_ upenn _dot_ edu >
Newman Center at the University of Pennsylvania
Received on Thu Feb 17 05:43:45 2005

This is an archived mail posted to the Subversion Dev mailing list.