
Re: Character sets for log messages

From: Henrik Svensson <innotron_at_telia.com>
Date: 2002-06-01 17:08:59 CEST

Quoting Colin Putney <colin@whistler.com>:

>
> If UTF-8 is required (option 1), even a simple client must convert
> between the local encoding and UTF-8. This is non-trivial, and can get
> more complex if the local encoding can vary according to the user's
> preference. An advanced client won't find this a problem since it's
> going to be jumping through all sorts of hoops to display arbitrary
> Unicode strings anyway. On the other hand, an advanced client won't
> benefit much from the implicit knowledge that log messages are in
> UTF-8.
>

It is not very difficult to convert from Unicode to any other charset.
Code and recommendations on how to do it (for most standard charsets)
are available. Some systems even have functions that will do it for
you. For a simple client that can only display 7-bit ASCII, for
example, it is even trivial: in UTF-8 every byte below 0x80 is already
the ASCII character, so those bytes can be passed through unchanged.
The only thing the client has to consider is that the text can contain
characters that it can't convert and print. How to handle this case
has to be decided by the client developer, but an easy solution is to
replace the unprintable characters with a simple placeholder.
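
To illustrate the placeholder approach, here is a minimal sketch in
Python. It is not taken from any Subversion client; the function name
convert_log_message and its charset parameter are made up for the
example. It assumes the log message arrives as UTF-8 bytes and relies
on the standard codec machinery to substitute "?" for anything the
target charset cannot represent.

```python
def convert_log_message(data: bytes, charset: str = "ascii") -> bytes:
    """Convert a UTF-8 log message to a local charset.

    Hypothetical helper for illustration only. Bytes that are not
    valid UTF-8, and characters the target charset cannot represent,
    are replaced with "?" instead of raising an error.
    """
    # Invalid UTF-8 sequences become U+FFFD rather than an exception.
    text = data.decode("utf-8", errors="replace")
    # Anything outside the target charset (including U+FFFD) becomes "?".
    return text.encode(charset, errors="replace")


message = "Räksmörgås €".encode("utf-8")
ascii_only = convert_log_message(message)            # 7-bit ASCII client
latin1 = convert_log_message(message, "latin-1")     # ISO 8859-1 client
```

A client limited to 7-bit ASCII loses the accented letters, while a
Latin-1 client keeps them and only replaces the euro sign. Which
placeholder to use is still up to the client developer.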

/Henrik

Received on Sat Jun 1 17:09:22 2002

This is an archived mail posted to the Subversion Dev mailing list.
