SteveKing wrote:
> Since a text file without BOMs can't be shown correctly at all
> (without doing some guessing about the encoding) there's a standard
> which requires text files to have BOMs in them - if they don't, then
> that means they're not Unicode (UTF-16, UTF-8, ...) but raw ASCII with
> a codepage (now guess the required codepage to show the file...).
As far as I understand Unicode, this is not entirely true.
See:
#1: unicode.org FAQ
http://www.unicode.org/faq/utf_bom.html#28
"Q: How I should deal with BOMs?"
#2: RFC 2376
http://www.faqs.org/rfcs/rfc2376.html
Section 5
#3: XML 1.0 W3C Recommendation
http://www.w3.org/TR/REC-xml/
Section 4.3.3
If I understand all this correctly, then the BOM on UTF-8 XML files is
optional (according to the Unicode and XML specs); some even say that
UTF-8 XML files SHOULD NOT have a BOM. Only UTF-16 XML entities MUST
have a BOM. Without a BOM, the encoding is determined by the encoding
declaration (encoding="utf-8"). Even if the encoding declaration is
missing, XML parsers must (should?) assume UTF-8 if no BOM is present.
So XML editors which delete the BOM when saving UTF-8 XML files still
produce valid XML files.
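
To make that detection order concrete, here is a minimal Python sketch
of how I read the rules: BOM first, then the encoding declaration, then
the UTF-8 default. It is my own illustration, not code from any XML
parser. The name guess_xml_encoding is hypothetical, and the sketch
deliberately ignores UTF-32 and UTF-16 data without a BOM.

```python
import codecs
import re

# Byte order marks this sketch recognises (UTF-32 is ignored on purpose).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Matches encoding="..." inside an XML declaration at the very start of the
# data. This only works for ASCII-compatible encodings, which is an
# assumption of this sketch.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def guess_xml_encoding(data: bytes) -> str:
    """Hypothetical helper: guess the encoding of an XML entity."""
    # 1. A BOM, if present, decides the encoding.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. Otherwise use the encoding declaration, if there is one.
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # 3. No BOM and no declaration: fall back to UTF-8.
    return "utf-8"
```

With this order, a UTF-8 XML file whose BOM was stripped by an editor
is still read as UTF-8, with or without an encoding declaration.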
On the other hand, I agree that a BOM for UTF-8 files makes a lot of
sense on the Windows platform. But what standard *requires* a BOM on
all Unicode text files?
Norbert
Received on Fri Nov 19 21:03:08 2004