There is a problem in the binary/text detector in Subversion 1.0.0 (Win32).
The Unicode standard defines a so-called byte-order mark. This is usually
placed at the beginning of a Unicode plain text file. This marker can
have these representations:
EF BB BF - UTF-8
FF FE - UTF-16/UCS-2, little endian
FE FF - UTF-16/UCS-2, big endian
FF FE 00 00 - UTF-32/UCS-4, little endian
00 00 FE FF - UTF-32/UCS-4, big endian
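
As a rough illustration, a byte-order mark can be recognized by comparing the first few bytes of a file against the known marks. The sketch below is in Python and is not Subversion code; the function name detect_bom and the table _BOM_TABLE are made up for this example. The marks themselves come from Python's standard codecs module. The UTF-32 marks are tested first because the UTF-32 little-endian mark (FF FE 00 00) begins with the UTF-16 little-endian mark (FF FE).

```python
import codecs
from typing import Optional

# Hypothetical lookup table for this sketch. UTF-32 entries come first
# because FF FE 00 00 would otherwise be matched as UTF-16 (FF FE).
_BOM_TABLE = (
    (codecs.BOM_UTF32_LE, "UTF-32/UCS-4, little endian"),  # FF FE 00 00
    (codecs.BOM_UTF32_BE, "UTF-32/UCS-4, big endian"),     # 00 00 FE FF
    (codecs.BOM_UTF8, "UTF-8"),                            # EF BB BF
    (codecs.BOM_UTF16_LE, "UTF-16/UCS-2, little endian"),  # FF FE
    (codecs.BOM_UTF16_BE, "UTF-16/UCS-2, big endian"),     # FE FF
)


def detect_bom(data: bytes) -> Optional[str]:
    """Return a label for the byte-order mark at the start of data, or None."""
    for bom, name in _BOM_TABLE:
        if data.startswith(bom):
            return name
    return None
```

Note that FF FE 00 00 is ambiguous in principle: it could also be a UTF-16 little-endian file whose first character is U+0000. The sketch simply prefers the longer match.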
When you save a plain text file as Unicode from Notepad (Windows XP),
Notepad adds this mark at the beginning of the file. If you then add
that file to a Subversion repository, it is marked as
application/octet-stream. If you remove the byte-order mark and add the
file again (under a different name, of course), Subversion does not
mark it as application/octet-stream.
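
One possible way to handle this, sketched in Python: check for a byte-order mark before applying whatever binary heuristic is in use. This is only an illustration, not Subversion's actual detector. Both function names are hypothetical, and the "any NUL byte means binary" rule is an assumed stand-in for the real heuristic, chosen because UTF-16 text made of ASCII characters contains a NUL byte for every character.

```python
import codecs

# Byte-order marks from the standard codecs module.
_BOMS = (
    codecs.BOM_UTF8,
    codecs.BOM_UTF16_LE,
    codecs.BOM_UTF16_BE,
    codecs.BOM_UTF32_LE,
    codecs.BOM_UTF32_BE,
)


def naive_is_binary(block: bytes) -> bool:
    """Assumed stand-in heuristic: any NUL byte marks the block as binary."""
    return b"\x00" in block


def bom_aware_is_binary(block: bytes) -> bool:
    """Treat a block that starts with a byte-order mark as text."""
    if block.startswith(_BOMS):
        return False
    return naive_is_binary(block)


# A short file as a "Unicode" save would produce it: UTF-16 little endian
# with a leading byte-order mark.
sample = codecs.BOM_UTF16_LE + "hello\r\n".encode("utf-16-le")
print(naive_is_binary(sample), bom_aware_is_binary(sample))
```

With this sample, the naive check flags the file as binary because of the NUL bytes, while the BOM-aware check accepts it as text. A real detector would probably also want to validate the bytes that follow the mark.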
More info and some ideas on how to determine if a file is Unicode:
http://msdn.microsoft.com/library/en-us/intl/unicode_42jv.asp
http://msdn.microsoft.com/library/en-us/intl/unicode_81np.asp
Adal Chiriliuc
Received on Sat Mar 6 23:06:21 2004