Re: Subversion Unicode Support

From: Norbert Unterberg <nepo_at_gmx.net>
Date: 2004-11-21 16:55:54 CET

Ulrich Eckhardt wrote:

> Subversion is totally ignorant of underlying file content, it treats all files
> as binary blobs.

I think this was not the best decision in Subversion's design. After
all, Subversion does have some support for text files (it supports
different CR/LF styles). Subversion would have been a better system if
it treated text files specially: a text file is a sequence of lines
that have a particular encoding (UTF-8, UTF-16, with or without BOM,
ASCII) and a particular end-of-line style (CRLF, CR, LF). Then many of
these strange problems would not exist.
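
To make the "encoding plus end-of-line style" idea concrete, here is a
rough Python sketch of how a tool might classify a file's bytes. The
function name describe_text and the cp1252 fallback are my own
assumptions for illustration; this is not how Subversion works. It only
recognizes encodings that announce themselves with a UTF-8 or UTF-16
BOM, and it would misread a UTF-32 LE file as UTF-16 LE.

```python
import codecs

# BOMs this sketch recognizes (UTF-32 is deliberately left out).
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def describe_text(data: bytes, fallback: str = "cp1252"):
    """Return (encoding, has_bom, eol_style) for raw file bytes.

    Hypothetical helper: files without a BOM are assumed to be in
    the fallback encoding, which is a guess, not a detection.
    """
    encoding, has_bom = fallback, False
    for bom, name in _BOMS:
        if data.startswith(bom):
            encoding, has_bom = name, True
            data = data[len(bom):]  # strip the BOM before decoding
            break
    text = data.decode(encoding)

    # Count each line-ending style; CRLF pairs must not be counted
    # again as a lone CR or a lone LF.
    crlf = text.count("\r\n")
    cr = text.count("\r") - crlf
    lf = text.count("\n") - crlf
    styles = [name for name, n in (("CRLF", crlf), ("CR", cr), ("LF", lf)) if n]
    if not styles:
        eol = "none"
    elif len(styles) == 1:
        eol = styles[0]
    else:
        eol = "mixed"
    return encoding, has_bom, eol
```

A version control system that stored this triple next to the content
could normalize line endings in any of these encodings, instead of
only in single-byte ones.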

However, I'm saying this without deeper knowledge of the details of
Subversion and character encodings, and without much thought. Maybe
there is much more behind this than I can see right now.

> If I were you, I
> would consider a) dropping UTF-16 altogether and b) storing files in UTF-8,

That would not be easy.
a) The native encoding for Win32 Unicode applications is UTF-16, so
switching to UTF-8 would require an additional resource-handling layer.
b) We also edit resource text files for an embedded target that uses
UTF-16 encoding.
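
For reference, the conversion itself is small; the cost lies in wiring
it into the build and into every tool that reads those files. A minimal
Python sketch, with function names that are purely illustrative and
with the assumption that the stored UTF-16 files are little-endian with
a BOM:

```python
import codecs


def utf16_to_utf8(data: bytes) -> bytes:
    """Convert UTF-16 bytes to UTF-8 without a BOM.

    The "utf-16" codec consumes a leading BOM and uses it to pick
    the byte order; without a BOM it falls back to the platform's
    native order, so input with a BOM is assumed here.
    """
    return data.decode("utf-16").encode("utf-8")


def utf8_to_utf16le(data: bytes) -> bytes:
    """Convert UTF-8 bytes back to UTF-16 LE with a leading BOM."""
    return codecs.BOM_UTF16_LE + data.decode("utf-8").encode("utf-16-le")
```

Round-tripping a file through both functions should give back the
original bytes, as long as the original was UTF-16 LE with a BOM.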

Changing the implementation of a project just because a tool lacks some
features would not be a good idea. In our current project, however,
there are only a few UTF-16 files; all the source files are still
encoded in the good old 8-bit Windows ANSI code page 1252.

Received on Sun Nov 21 16:58:20 2004

This is an archived mail posted to the Subversion Users mailing list.