
Re: [RFC/PATCH] commit messages not 8-bit compatible

From: Marcus Comstedt <marcus_at_mc.pp.se>
Date: 2002-05-31 14:46:17 CEST

"Henrik Svensson" <innotron@telia.com> writes:

> UTF-8 is actually not a character set. It is just a way to store
> unicode characters.

When you read "UTF-8" in the discussions, you may think "Unicode"
instead if you like. The discussions are not really about
representation of Unicode characters, but about whether to translate
user input in other character sets into Unicode. If Unicode
characters are to be used, then the UTF-8 representation comes rather
naturally to this application, since it avoids byte-order problems and
can be inserted directly into XML. (Also, the size of wchar_t is
platform dependent, which means it can't be used for communication
between client and server. UTF-8 just uses octets.) The UTF-8
representation has its problems too, but they are mainly related to
processing individual characters, something Subversion doesn't do a
lot. If a client wants to use UCS-4 representation instead, it's easy
enough for it to convert.
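
To illustrate the byte-order point, here is a minimal Python sketch. It is
not Subversion code; the sample string and the helper name utf8_to_ucs4 are
made up for this example. It shows that the same text has one UTF-8 octet
sequence but two possible UCS-4 serializations, and that a client can
recover the UCS-4 code points from UTF-8 with a plain decode.

```python
# Hypothetical sample input; any non-ASCII text would do.
text = "Räksmörgås"

# UTF-8 is defined as a sequence of octets, so there is only one
# serialization and no byte order to agree on.
utf8 = text.encode("utf-8")

# UCS-4 (UTF-32) uses 4-byte units, so the octets on the wire depend
# on whether the sender is big-endian or little-endian.
ucs4_be = text.encode("utf-32-be")
ucs4_le = text.encode("utf-32-le")


def utf8_to_ucs4(data):
    """Decode UTF-8 octets into a list of UCS-4 code points (integers)."""
    return [ord(ch) for ch in data.decode("utf-8")]


if __name__ == "__main__":
    print(len(utf8), "octets in UTF-8")
    print(len(ucs4_be), "octets in UCS-4")
    print("UCS-4 serializations differ:", ucs4_be != ucs4_le)
    print([hex(cp) for cp in utf8_to_ucs4(utf8)])
```

A client that prefers UCS-4 internally would do the equivalent of
utf8_to_ucs4() at its boundary and keep UTF-8 for everything exchanged
with the server.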

  // Marcus

Received on Sat Jun 1 14:12:49 2002

This is an archived mail posted to the Subversion Dev mailing list.