On Mon, 29 Nov 2004 kfogel@collab.net wrote:
> On the client side (and maybe over the wire, in DAV?) we use XML
> attributes to hold path names, at least basenames. Thus, any
> character that cannot be represented in an XML attribute value cannot
> be used in a filename. Note that this is *not* the same as
> prohibiting characters which cannot appear directly in an attribute
> value. Some characters cannot appear directly, but perhaps can be
> represented by escaping.
>
> In order to solve this, we need to know exactly what can be
> *represented* in an attribute value. Do we know the answer to that
> yet? (For example, can we represent all valid UTF8 paths in an XML
> attribute value?)
>
> Once we know the answer to that basic question, we need to decide what
> subset (possibly the entire set) we want to support.
>
> VK Sameer, is this a reasonable summary? And do you happen to know
> the answer to the first question?
>
For XML 1.0, the valid characters are specified in
http://www.w3.org/TR/2004/REC-xml-20040204/#charsets
That production excludes control characters (except for tab, LF and CR),
surrogates, U+FFFE and U+FFFF. The excluded characters can't be
represented in XML 1.0 at all, not even with character references. XML 1.1
is less restrictive (you can find the corresponding syntax production
there). For interoperability, we should stick to 1.0 for the time being.
This means that if we want these control characters in XML, we need to put
some proprietary escaping on top of it. Period. That's how simple it gets.
So, couldn't we just drop the idea of control chars in filenames? :-)
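
To make the rule concrete, here is a rough Python sketch of the XML 1.0
Char production as a filename check. It is only an illustration, not
anything from the Subversion tree: the function names and the
<entry name="..."/> element are made up for the example. The last
function feeds a numeric character reference to expat to show that
escaping does not rescue an excluded character.

```python
import xml.parsers.expat


def is_xml10_char(cp):
    """True if code point cp matches the XML 1.0 Char production:
    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    """
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)


def representable_in_xml10(name):
    """True if every character of name may appear in an XML 1.0 document
    (hypothetical helper; XML-level escaping of <, & and quotes is a
    separate matter and not checked here)."""
    return all(is_xml10_char(ord(ch)) for ch in name)


def charref_parses(cp):
    """True if expat accepts code point cp written as a decimal character
    reference inside an attribute value (element name is made up)."""
    doc = '<entry name="&#%d;"/>' % cp
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(doc, True)
    except xml.parsers.expat.ExpatError:
        return False
    return True


if __name__ == "__main__":
    print(representable_in_xml10("foo bar.txt"))  # ordinary name
    print(representable_in_xml10("bell\x07.txt"))  # contains U+0007
    print(charref_parses(7))                       # &#7; in an attribute
```

A path that fails representable_in_xml10() would need the extra escaping
layer mentioned above, or would simply have to be rejected.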
OK, time for some sleep before I get rude... :-)
Best Regards,
//Peter
Received on Tue Nov 30 00:24:39 2004