Re: Unicode UTF-16 files detected as binary

From: Scott Palmer <scott.palmer_at_2connected.org>
Date: 2005-01-05 20:40:00 CET

On Jan 5, 2005, at 1:50 PM, Max Bowsher wrote:

> For example, I would say that UTF-{16,32} are effectively binary files, in
> many ways.
> They can't be diffed, unless you teach the diff program what a line ending
> is in the new format, and they can't be displayed on most terminals,
> nor easily shown in email.
> They require special editors/viewers, just like MSWord docs require
> special editors.

Unicode is the new ASCII. The editors are already here. If even
Notepad.exe can handle it, the bar is set pretty low :)

> Anyway, that's my opinion.
>
> I think if svn is going to start treating UTF-16 as text, it at least
> needs to be taught to diff it properly.

I think this should be done. Unicode in UTF-16 is no less valid as
text than any particular 8-bit code page. The fact that old tools
don't understand it is precisely the problem that needs to be fixed.
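
As a rough illustration of what "teaching diff" could mean, here is a
minimal Python sketch that decodes UTF-16 to text before comparing
lines. It is not how svn or any existing diff tool works. The helper
names decode_utf16 and diff_utf16 are made up for this example, and
falling back to little-endian when there is no byte order mark is an
assumption.

```python
import codecs
import difflib
from typing import List


def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 bytes to text (hypothetical helper for this sketch)."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The generic "utf-16" codec reads the BOM and strips it.
        return data.decode("utf-16")
    # Assumption: files without a BOM are little-endian.
    return data.decode("utf-16-le")


def diff_utf16(old: bytes, new: bytes) -> List[str]:
    """Return a unified diff of two UTF-16 encoded byte strings."""
    # Split on line endings only after decoding, so a line feed is
    # recognised as a character rather than as a lone 0x0A byte.
    old_lines = decode_utf16(old).splitlines(keepends=True)
    new_lines = decode_utf16(new).splitlines(keepends=True)
    return list(difflib.unified_diff(old_lines, new_lines, "old", "new"))
```

A byte-oriented tool goes wrong because a UTF-16 line feed is two
bytes (0A 00 in little-endian order), so splitting on the single byte
0A leaves a stray 00 at the start of every following line.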

Note that Java source code can actually be supplied to the compiler in
UTF-16 format (though I have never heard of anyone doing that) so the
need to support this isn't as odd as it might appear.

Regards,

Scott

Received on Wed Jan 5 20:43:18 2005

This is an archived mail posted to the Subversion Users mailing list.