Re: UTF-8

From: Marcus Comstedt <marcus_at_mc.pp.se>
Date: 2002-05-23 14:26:11 CEST

Greg Hudson <ghudson@MIT.EDU> writes:

> Er, I'm not sure if I missed an earlier conversation about this.
>
> Why precisely are these conversions necessary? They don't seem to be
> about UTF-8 support, but about supporting conversion from UTF-8 to other
> character sets. It seems like a lot of hair for Subversion to be
> getting into that business.

They are necessary because Subversion is not a closed world. It needs
to communicate with the operating system and the user. If they do not
use UTF-8, and Subversion does internally, conversions need to be
made.

I don't know if there has been any lengthy discussion about the design
choice to always use UTF-8 internally, regardless of the character
encoding used by the operating system and user. I've only seen Greg
Stein say that it is decided (but not documented) that it should work
that way. And it does have real advantages from a version control
system perspective. It ensures that if I commit a file "räksmörgås"
to the repository, then it will still be called "räksmörgås" when
somebody else checks it out, even if this someone is using a different
locale. (Or, if the different locale doesn't even allow for the
characters "åäö", that he gets an error message rather than an
incorrectly named file.)
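
As a rough illustration of that round trip (a minimal sketch in Python,
not Subversion's actual code; the helper names to_internal and
from_internal are made up for this example), the conversion at the
boundary could look something like this:

```python
def to_internal(name_bytes: bytes, locale_encoding: str) -> bytes:
    """Convert a name from the user's locale encoding to UTF-8 (hypothetical helper)."""
    return name_bytes.decode(locale_encoding).encode("utf-8")


def from_internal(utf8_name: bytes, locale_encoding: str) -> bytes:
    """Convert a UTF-8 name to the locale encoding, or fail loudly (hypothetical helper)."""
    text = utf8_name.decode("utf-8")
    try:
        return text.encode(locale_encoding)
    except UnicodeEncodeError as exc:
        # The target locale has no representation for some character:
        # report an error instead of producing a wrongly named file.
        raise ValueError(
            f"cannot represent {text!r} in {locale_encoding}"
        ) from exc


# "Commit" from an ISO-8859-1 locale: the name is stored as UTF-8.
stored = to_internal("räksmörgås".encode("iso-8859-1"), "iso-8859-1")

# "Checkout" in a UTF-8 locale: the same name comes back.
checked_out = from_internal(stored, "utf-8")

# "Checkout" in an ASCII-only locale: an error, not a mangled name.
try:
    from_internal(stored, "ascii")
    ascii_result = "converted"
except ValueError:
    ascii_result = "error"
```

The point of the sketch is only that the stored bytes are the same no
matter which locale committed them, so each client converts at its own
boundary and either gets the right name or an error.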

(Of course, other fixed internal representations than UTF-8 could have
 been chosen, but I assume UTF-8 was chosen because A) it can handle all
 characters in frequent use and B) it is the default encoding for
 XML, and XML is used as a transport mechanism inside Subversion.)

  // Marcus

Received on Thu May 23 14:31:13 2002

This is an archived mail posted to the Subversion Dev mailing list.