
RE: "Malformed XML" (l10n?) problems on checkout

From: Dale Worley <dworley_at_pingtel.com>
Date: 2005-02-16 21:30:50 CET

> From: Charles Bailey [mailto:bailey@newman.upenn.edu]

> Sure. I think I've got it. By process of elimination, the offending file
> seems to be one named '.ooo^H^H^Htestff^Hile' (where ^H is the usual \x08
> backspace character). (No, I've no idea why the creator of this file --
> likely OpenOffice.org -- chose to use this name.)

*Why* that file name exists is clear -- someone was trying to type a file
name into a box. He typed ".ooo", and then decided he didn't like that, so
he typed ^H three times, which moved the cursor back three spaces. Then he
typed "test", which overwrote the offending "ooo" on screen but didn't
actually remove those characters from the program's input buffer. Similarly,
the final ^H was to correct the second "f" so he could replace it with "i".
The name he thought he was getting was ".testfile".
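
To make that concrete, here is a rough Python sketch of the difference
between what is stored and what is seen. The helper name apply_backspaces is
made up for illustration, and it assumes a display where ^H only moves the
cursor left and later characters overwrite in place; real terminals and
input boxes may behave differently.

```python
def apply_backspaces(typed: str) -> str:
    """Return what an overwrite-style display would show for `typed`.

    Assumption: "\b" moves the cursor one cell left without erasing,
    and every other character is written at the cursor position.
    """
    shown = []   # characters currently visible on the line
    cursor = 0   # index where the next character will be written
    for ch in typed:
        if ch == "\b":
            cursor = max(0, cursor - 1)
        elif cursor < len(shown):
            shown[cursor] = ch   # overwrite an existing cell
            cursor += 1
        else:
            shown.append(ch)     # extend the line
            cursor += 1
    return "".join(shown)


# The name as stored on disk: the backspaces are ordinary bytes in it.
raw_name = ".ooo\b\b\btestff\bile"

print(repr(raw_name))              # the stored name, control characters visible
print(apply_backspaces(raw_name))  # the name the user believed he typed
```

The stored name keeps all four \x08 bytes, which is why it later trips up
anything that refuses control characters.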

But you've identified the Subversion problem correctly -- a file name can
contain "non-printable" control characters such as \x08, which are forbidden
in XML 1.0. Worse, what you might think is a valid escape sequence to
represent it -- &#8; -- is also forbidden, because "character references"
may not refer to characters that are themselves forbidden. See the
discussion in

http://www.w3c.org/TR/2004/REC-xml-20040204/#dt-charref
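
A quick way to see both failures is to feed such a name to an XML parser.
The sketch below uses Python's standard xml.etree.ElementTree (built on
expat), which is not necessarily the parser Subversion uses; the element
name "entry" and attribute "name" are invented for the example and are not
Subversion's actual format.

```python
import xml.etree.ElementTree as ET


def is_well_formed(doc: str) -> bool:
    """Return True if `doc` parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Hypothetical documents carrying the file name in an attribute.
raw_doc = "<entry name='.ooo\x08\x08\x08testff\x08ile'/>"      # literal backspaces
ref_doc = "<entry name='.ooo&#8;&#8;&#8;testff&#8;ile'/>"      # character references
plain_doc = "<entry name='.testfile'/>"                        # the intended name

for label, doc in [("raw", raw_doc), ("charref", ref_doc), ("plain", plain_doc)]:
    print(label, is_well_formed(doc))
```

With this parser, only the plain name is accepted; the literal \x08 and the
&#8; reference are both rejected as not well-formed.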

Subversion may need to extend XML to allow this (and has to verify that its
XML parser can deal with it).

Dale

Received on Wed Feb 16 21:36:12 2005

This is an archived mail posted to the Subversion Dev mailing list.