RE: "Malformed XML" (l10n?) problems on checkout

From: Charles Bailey <bailey_at_newman.upenn.edu>
Date: 2005-02-16 22:34:08 CET

--On February 16, 2005 3:30:50 PM -0500 Dale Worley <dworley@pingtel.com> wrote:

>> From: Charles Bailey [mailto:bailey@newman.upenn.edu]
>> Sure. I think I've got it. By process of elimination, the offending file
>> seems to be one named '.ooo^H^H^Htestff^Hile' (where ^H is the usual \x08
>> backspace character). (No, I've no idea why the creator of this file --
>> likely OpenOffice.org -- chose to use this name.)
> *Why* that file name exists is clear -- someone was trying to type a file
> name into a box. He typed ".ooo", and then decided he didn't like that,

That'd've been my first instinct as well. The location and name strike me
as quite unlikely for a manual save, though, so I think it might be a cute
name used by OpenOffice sometime in days past (just based on the '.ooo'

> But you've identified the Subversion problem correctly -- a file name can
> contain "non-printable" characters, which are forbidden in XML. Worse,
> what you might think is a valid escape sequence to represent it -- &#8; --
> is also forbidden in XML, because "character entities" are forbidden from
> representing non-printable characters. See the discussion in
> http://www.w3c.org/TR/2004/REC-xml-20040204/#dt-charref
> Subversion may need to extend XML to allow this (and has to verify that
> its XML parser can deal with it).

Interestingly, the W3C's XML 1.1 recommendation and i18n FAQ seem to
indicate that numeric character references are legal for control codes
other than NUL in XML 1.1
(<http://www.w3.org/International/questions/qa-controls>). That may mean
XML parser support isn't that unlikely. NUL is still a problem, but for
most programs XML would be the least of the places where a NUL in a file
name causes trouble.
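
A minimal sketch of the XML 1.0 versus XML 1.1 difference discussed above, written in Python purely for illustration (Subversion itself is not written in Python, and none of these helper names come from its code). The two predicates transcribe the "Char" productions of the respective recommendations; `xml10_parser_accepts` is a hypothetical helper that asks Python's bundled expat-based parser, which implements XML 1.0, whether a document parses. Note that even in XML 1.1 a backspace would have to appear as a character reference, not as a literal byte.

```python
import xml.etree.ElementTree as ET


def is_xml10_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 'Char' production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_char(cp: int) -> bool:
    """True if cp matches the XML 1.1 'Char' production.

    XML 1.1 admits every code point except NUL, the surrogates,
    U+FFFE and U+FFFF.
    """
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def xml10_parser_accepts(text: str) -> bool:
    """Hypothetical helper: does Python's XML 1.0 parser accept text?"""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


BACKSPACE = 0x08
print(is_xml10_char(BACKSPACE))                   # backspace under XML 1.0 rules
print(is_xml11_char(BACKSPACE))                   # backspace under XML 1.1 rules
print(is_xml11_char(0x00))                        # NUL under XML 1.1 rules
print(xml10_parser_accepts("<name>&#8;</name>"))  # reference to a backspace
print(xml10_parser_accepts("<name>&#9;</name>"))  # reference to a tab
```

Under these rules the backspace in the file name is rejected by XML 1.0 both literally and as `&#8;`, is representable in XML 1.1 as a reference, and NUL remains excluded in both versions.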

The suggestion of URI-encoding filenames sounds nice, though it wouldn't be
backwards-compatible. If the client and server exchange version info,
perhaps it could be a runtime selection.
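
The URI-encoding idea with a runtime switch might look roughly like the following Python sketch. Everything here is an assumption made for illustration: the function names, the `MIN_ENCODED_NAMES_VERSION` constant, and the notion of a single integer protocol version are hypothetical and are not taken from Subversion's actual wire protocol.

```python
from urllib.parse import quote, unquote

# Hypothetical: the first protocol version at which both sides agree that
# file names on the wire are percent-encoded.
MIN_ENCODED_NAMES_VERSION = 2


def encode_name(name: str) -> str:
    """Percent-encode everything except unreserved URI characters."""
    return quote(name, safe="")


def decode_name(encoded: str) -> str:
    """Reverse encode_name."""
    return unquote(encoded)


def wire_name(name: str, peer_version: int) -> str:
    """Pick the representation at runtime from the peer's version.

    Older peers get the raw name, so behaviour for them is unchanged
    (including the failure on control characters).
    """
    if peer_version >= MIN_ENCODED_NAMES_VERSION:
        return encode_name(name)
    return name


odd_name = ".ooo\x08\x08\x08testff\x08ile"
print(wire_name(odd_name, peer_version=2))
print(decode_name(wire_name(odd_name, peer_version=2)) == odd_name)
```

Because a literal `%` in a name is itself encoded (as `%25`), the encoding round-trips, and the encoded form contains only characters that are legal in XML 1.0.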

Charles Bailey  < bailey _at_ newman _dot_ upenn _dot_ edu >
Newman Center at the University of Pennsylvania
Received on Thu Feb 17 05:42:39 2005

This is an archived mail posted to the Subversion Dev mailing list.