Re: [RFC/PATCH] commit messages not 8-bit compatible

From: Stephen C. Tweedie <sct_at_redhat.com>
Date: 2002-05-31 12:06:50 CEST

Hi,

On Fri, May 31, 2002 at 09:23:26AM +0200, Henrik Svensson wrote:
 
> UTF-8 is actually not a character set. It is just a way to store
> unicode characters.

Amen, this is something a lot of people miss.
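
To make the distinction concrete, here is a small Python sketch (not
anything from the svn code base; the sample character is an arbitrary
choice). The same abstract character comes out as different octet
sequences depending on the encoding, and decodes back to the same
character each time:

```python
# One abstract Unicode character: U+0416 CYRILLIC CAPITAL LETTER ZHE.
ch = "\u0416"

# Three different ways of storing that same character as octets.
utf8_bytes = ch.encode("utf-8")        # variable-width Unicode encoding
utf16_bytes = ch.encode("utf-16-be")   # another Unicode encoding
koi8_bytes = ch.encode("koi8_r")       # legacy single-byte Cyrillic charset

# The octets differ, but each decodes back to the identical character.
same_char = (
    utf8_bytes.decode("utf-8")
    == utf16_bytes.decode("utf-16-be")
    == koi8_bytes.decode("koi8_r")
    == ch
)
```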

> When it is time to display the text to a human (sometimes called
> rendering the text) it's the client software that is responsible for
> doing it right.

Yes, but...

> If it is for some reason (e.g. missing fonts) impossible
> for the client to render the characters correctly, it should not try to
> do any interpretation, just replace the unknown characters with some
> known glyph (in MS Windows it is a small square).

...in this case, the client may well be in text mode, and it's not
subversion itself which is doing the final rendering. svn is just
recoding the charset into whatever octet encoding the user's terminal
is expecting, and at _that_ point, svn does need to care very much
what the charset in use is. If the UTF-8 string happens to contain
nothing except KOI-8 characters and the user's terminal is set up in
an ISO-8859-15 locale, then there may be times when the subversion
client would be better off printing a placeholder such as
"[unprintable string --- KOI-8 character set]" rather than simply
replacing the whole string with gibberish.
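
As a rough illustration of that fallback, here is a minimal Python
sketch. It is not how svn does it (svn is written in C); the function
name recode_for_terminal and the placeholder wording are hypothetical,
and guessing the script from the Unicode character name is just one
simple heuristic, assumed here for illustration.

```python
import unicodedata


def recode_for_terminal(text: str, terminal_encoding: str) -> bytes:
    """Recode text for the terminal, or return a placeholder if it cannot be
    represented in the terminal's encoding (hypothetical helper)."""
    try:
        # Strict encoding: fails on the first unrepresentable character.
        return text.encode(terminal_encoding)
    except UnicodeEncodeError as err:
        # Describe the first offending character by the leading word of its
        # Unicode name, e.g. "CYRILLIC" for U+0416.
        bad_char = text[err.start]
        script = unicodedata.name(bad_char, "UNKNOWN").split()[0]
        note = f"[unprintable string --- contains {script} characters]"
        # The placeholder is plain ASCII, so this encode should not fail for
        # ASCII-compatible terminal encodings; "replace" is a safety net.
        return note.encode(terminal_encoding, errors="replace")


# A Cyrillic log message shown on a Latin terminal versus a KOI8-R terminal.
latin_out = recode_for_terminal("привет", "iso8859_15")
koi8_out = recode_for_terminal("привет", "koi8_r")
```

On the Latin terminal the user gets one readable placeholder line; on
the KOI8-R terminal the string passes through as six octets.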

Cheers,
 Stephen

Received on Sat Jun 1 14:13:11 2002

This is an archived mail posted to the Subversion Dev mailing list.