
Re: Action request: mime-type of xml-dtd should be treated as text

From: John Peacock <john.peacock_at_havurah-software.org>
Date: Tue, 05 Feb 2008 12:36:25 -0500

Mark Irving wrote:
> An XML DTD is often, but not always, prepared with a text editor
> or a syntax-aware text editor. Exactly the same claim can be
> made about, say, a C++ source file. If SVN presents C++ source
> as text, shouldn't it do the same for application/xml-dtd? The
> argument is weaker for application/xml, which is more likely to
> be edited with a specialized program, but is often text.

I've already responded several times to these threads explaining that,
in the general case, XML files are not "text documents" from the point
of view of Subversion (or of ordinary diff tools, for that matter). So
far, no one seems to believe me, perhaps because I have not been using
the appropriate language. Let me try again.

XML documents are *structured* documents that just so happen to be
[usually] stored as textual files. By this I mean that the actual
makeup of the documents themselves is ASCII (or possibly UTF-8)
characters (which would normally be considered "text"); I'm ignoring
CDATA blocks for the moment. But the textual representation of any XML
document is not fixed; there are transformations of that text that
yield equivalent XML documents.

The classic example is attributes, which are by definition an unordered
set that applies to a given element. You can change the order of these
attributes in the XML file itself, and yet the XML document is
identical. This is why there are syntax-aware XML editors and diff
programs that take this into account.
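
To make the attribute case concrete, here is a minimal sketch, assuming
Python's standard xml.etree.ElementTree module. The sample element and
the helper name same_element are made up for illustration; nothing here
comes from Subversion itself.

```python
import xml.etree.ElementTree as ET

# Two hypothetical serializations of one element; only the attribute
# order differs between them.
TEXT_A = '<book id="42" lang="en">Subversion</book>'
TEXT_B = '<book lang="en" id="42">Subversion</book>'


def same_element(a, b):
    """Compare tag, attributes (as an unordered mapping) and text."""
    ea, eb = ET.fromstring(a), ET.fromstring(b)
    return ea.tag == eb.tag and ea.attrib == eb.attrib and ea.text == eb.text


# A byte-for-byte comparison sees two different files...
texts_equal = TEXT_A == TEXT_B
# ...while a comparison of the parsed elements sees no difference.
elements_equal = same_element(TEXT_A, TEXT_B)

print("text equal:", texts_equal)
print("elements equal:", elements_equal)
```

The helper only looks at a single element; a real XML-aware comparison
would have to recurse into child elements as well.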

In fact, under the XML 1.0 specification, the elements themselves in a
well-formed document can be considered unordered as well; see this
discussion:

        http://www-128.ibm.com/developerworks/xml/library/x-eleord.html

The fact that an XML parser will, most of the time, return the elements
in document order doesn't mean that this is the only valid
representation of that particular XML document. Several XML parsers
that I am aware of reorder the document's elements and attributes as a
matter of course (since they use hashes during the parsing process).

Does this make more sense? In many cases, if there were a way to tag a
given file as being textual under SVN (through the use of a new
svn:textual property, for example), then diff and blame would do the
right thing. But this would be accidental, in the sense that some other
tool could rewrite the XML file into a completely equivalent document,
at which point diff and blame would be completely useless.
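
As a rough illustration of that failure mode, the sketch below runs a
line-based diff over two equivalent serializations, first as raw text
and then after canonicalizing both. It assumes Python 3.8 or later (for
xml.etree.ElementTree.canonicalize); the sample config and the helper
name line_diff are hypothetical and not part of any Subversion tooling.

```python
import difflib
import xml.etree.ElementTree as ET

# Hypothetical file contents before and after some tool rewrote the
# file; only the attribute order on <server> changed.
BEFORE = '<config>\n  <server host="a.example" port="80"/>\n</config>\n'
AFTER = '<config>\n  <server port="80" host="a.example"/>\n</config>\n'


def line_diff(old, new):
    """Return unified-diff lines, as a line-oriented diff tool would."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(), "before", "after", lineterm=""))


# Plain text comparison: the rewritten line shows up as a change.
text_changes = line_diff(BEFORE, AFTER)

# Canonical XML writes attributes in a fixed order, so both inputs
# serialize to the same text and the diff comes back empty.
canonical_changes = line_diff(
    ET.canonicalize(xml_data=BEFORE), ET.canonicalize(xml_data=AFTER))

print(len(text_changes), "diff lines on raw text")
print(len(canonical_changes), "diff lines after canonicalization")
```

Canonicalization only normalizes what the canonical XML rules cover
(attribute order, empty-element form, and so on); it does not reorder
elements, so it is not a general answer to every equivalent rewrite.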

Future development of Subversion will probably include a way to map a
specific external tool to a specific MIME type. This would allow you to
use an XML diff tool for comparing changes to XML files, just as it
would allow you to use an image tool to compare changes to JPEGs, for
example.

John

Received on 2008-02-05 18:36:52 CET

This is an archived mail posted to the Subversion Users mailing list.