Greg Hudson <ghudson@MIT.EDU> writes:
> I think you're making the situation overly vague. I was surprised and
> irritated to learn that XML attribute values are normalized, but the
> rules are quite clear once you're pointed in the right direction:
>
>   http://www.w3.org/TR/REC-xml#AVNormalize
>
> Since undeclared attributes are assumed to be CDATA, we should be okay
> as long as we quote #x9, #xA, and #xD characters in attribute values.
> It's possible that the rules are not implemented consistently by XML
> parsers, but I have seen no evidence of that claim.
Yup, that's very clear; thanks for posting the link. I believe that
Expat was actually behaving compliantly w.r.t. that spec today, so my
claim that parser implementations could be unreliable was unreliable.
I'd never heard of XML normalization before today. When I saw Expat
turning two consecutive spaces into one, I wondered if I'd fallen into
some sort of horror movie.
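
For anyone who wants to see the rule in action, here is a small sketch using Python's xml.parsers.expat module, which is a binding to Expat. The element name "e", the attribute name "a", and the helper parse_attr() are made up for illustration. It assumes an undeclared (hence CDATA) attribute, and compares a literal tab and newline against the same characters written as character references.

```python
import xml.parsers.expat


def parse_attr(xml_text):
    """Parse xml_text and return the value of attribute 'a' as Expat reports it.

    'a' is a hypothetical attribute name used only for this demonstration.
    """
    seen = {}

    def start(name, attrs):
        # attrs is a dict of attribute name -> normalized value
        seen.update(attrs)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(xml_text, True)
    return seen["a"]


# Literal #x9 and #xA inside the attribute value are each replaced by a space.
literal = parse_attr('<e a="x\ty\nz"/>')

# The same characters written as character references survive normalization.
escaped = parse_attr('<e a="x&#9;y&#10;z"/>')

print(repr(literal))
print(repr(escaped))
```

The literal form comes back with spaces where the tab and newline were, while the character-reference form comes back with the original characters intact, which is what the AVNormalize rules describe.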
> I am -1 on this change unless a better argument can be presented for
> making it.
If Expat is really following these rules, I agree with you. We're
going to need a pair of routines for attribute escaping/unescaping now,
though, analogous to xml_escape() and xml_unescape() in libsvn_subr/xml.c.
Sigh.
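
A rough sketch of what such a pair might look like, written in Python rather than C to keep it short. The names attr_escape() and attr_unescape() are hypothetical and are not the routines in libsvn_subr/xml.c. The only assumption beyond the usual markup characters is the one from the spec discussion above: #x9, #xA, and #xD must go out as character references so that normalization leaves them alone.

```python
# Raw character -> escaped form. '&' must come first when escaping so that
# the ampersands introduced by later replacements are not escaped again.
_ESCAPES = [
    ("&", "&amp;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ('"', "&quot;"),
    ("'", "&apos;"),
    ("\t", "&#9;"),   # #x9
    ("\n", "&#10;"),  # #xA
    ("\r", "&#13;"),  # #xD
]


def attr_escape(value):
    """Escape a string for use inside a quoted XML attribute value."""
    for raw, ref in _ESCAPES:
        value = value.replace(raw, ref)
    return value


def attr_unescape(text):
    """Reverse attr_escape(); '&amp;' is handled last to avoid double decoding."""
    for raw, ref in reversed(_ESCAPES):
        text = text.replace(ref, raw)
    return text


sample = 'a\tb\n"c" & <d>'
escaped_sample = attr_escape(sample)
print(escaped_sample)
print(attr_unescape(escaped_sample) == sample)
```

A conforming parser already does the unescaping on the way in, so the second routine is mostly there for symmetry and for code that handles attribute text without a parser.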
(And, I'd just like to say again that I hate XML.)
http://www.rants.org/xml.html
-Karl
Received on Sat Oct 14 02:21:16 2006