On Sat, Sep 09, 2000 at 01:08:02AM -0400, Greg Hudson wrote:
> > "XML documents may, and should, begin with an XML declaration which
> > specifies the version of XML being used."
> Okay, missed that. What about the encoding, though? UTF-8 is the
> default encoding; what value is there in specifying it explicitly? I
UTF-8 is the default, but stating explicitly that the document *is* in UTF-8
is a bit better than having a person or tool accidentally assume latin-1 or
When you read that, did you have any doubts? How about the neophyte XML user
who doesn't know the default? Maybe they will presume iso8859-1 (latin-1)
and drop some goofy characters in there. Or heck, what about our own
developers? I will lay small odds that Karl and Ben haven't considered
character sets yet. (However, this is probably a safe bet, since we can treat
most of the stuff as blobs of bytes regardless of charset.)
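
To make the "goofy characters" risk concrete, here is a minimal Python sketch.
The sample document and its element name are invented for illustration; they
are not taken from any Subversion file.

```python
# Hypothetical sample document with no encoding in its declaration,
# so an XML parser must treat it as UTF-8.
text = '<?xml version="1.0"?>\n<author>José</author>\n'

utf8_bytes = text.encode("utf-8")

# A person or tool that wrongly assumes latin-1 gets garbage, silently.
misread = utf8_bytes.decode("latin-1")

# Going the other way: latin-1 bytes dropped into the same undeclared
# document are not valid UTF-8 at all.
latin1_bytes = text.encode("latin-1")
try:
    latin1_bytes.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
```

The first mistake goes unnoticed until someone reads the output; the second
one stops a conforming parser cold. An explicit encoding="utf-8" at the top
gives the reader a chance to avoid both.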
> can only imagine it getting in the way, e.g. if the transport decided
> it wanted to re-encode the document in UTF-16 and transmit it that
That document is in UTF-8. If the transport wants to recode, then it is
going to have to do that recoding regardless of whether the encoding is
specified in there. Further, the transport will probably want to insert the
new encoding into the declaration. Given that it will probably drop that
first line entirely and output a new one, it doesn't matter whether it
has to work around an encoding="" in that declaration.
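
A rough Python sketch of that recoding step, under some assumptions: the
recode() helper and its parameters are hypothetical, the document is assumed
to be XML 1.0, and a real transport would likely use a proper XML parser
rather than a regular expression.

```python
import re

# Matches an XML declaration at the very start of the document text,
# whether or not it carries an encoding="..." attribute.
XML_DECL = re.compile(r'^<\?xml[^>]*\?>\s*')

def recode(doc_bytes, source="utf-8", target="utf-16"):
    """Hypothetical transport helper: drop the first line, write a new one."""
    text = doc_bytes.decode(source)
    # Throw away the old declaration entirely (if there is one).
    body = XML_DECL.sub("", text, count=1)
    # Emit a fresh declaration naming the new encoding (assumes XML 1.0).
    decl = '<?xml version="1.0" encoding="%s"?>\n' % target
    return (decl + body).encode(target)
```

Feeding it a document with encoding="utf-8" in the declaration and the same
document without it yields identical output, which is the point: the recoding
work is the same either way.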
> (I don't think this is an important issue for Subversion, but since
> you are doubtless involved in a lot of other XML-using projects, and I
> probably will be in the future, I'd like to hammer out these little
Hard to say whether it is important, but it is at least *relevant*. And
discussion and learning are among the best things about Open Source
development. There is no way that we could ever ask an NT engineer, "why did
you do <that>?" But we can ask that of each other all day long...
Greg Stein, http://www.lyra.org/
Received on Sat Oct 21 14:36:08 2006