Re: Character sets for log messages

From: Henrik Svensson <innotron_at_telia.com>
Date: 2002-06-02 00:25:16 CEST

Quoting Nuutti Kotivuori <naked@iki.fi>:

> Colin Putney wrote:
> > On Saturday, June 1, 2002, at 08:08 AM, Henrik Svensson wrote:
> >> It is not very difficult to convert from unicode to any other
> >> charset.
>
> [...]
>
> >> In the case of a simple client that for example only can display 7
> >> bit ASCII it is even trivial, only remove the most significant byte
> >> from every character in the array.
>
> [...]
>
> > Well, just stripping off the high bit would leave garbage characters
> > wherever there are multibyte sequences, so you'd have to be able to
> > recognize those sequences and deal with them appropriately.
>
> Well, assuming that the 'unicode' above would mean an UTF-8 encoded
> string - and assuming Henrik meant that remove characters which have
> the most significant _bit_ set from the array.
>
> All multibyte sequences in UTF-8 consist of only characters with the
> most significant bit set. So there would be no garbage, just stripping
> of everything non-ASCII.
>
> -- Naked
>
That is one way to do it. I described a faulty algorithm in my posting,
but that's not important at the moment. The important thing is that it
would be simple to make the current default Subversion client and all
future clients able to handle the scripts defined in the Unicode standard
gracefully. At least I think that is a good thing.
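
For illustration, here is a minimal sketch of the stripping approach
Nuutti describes, assuming the log message arrives as a UTF-8 encoded
byte string. The function name strip_non_ascii and the sample message
are made up for this example and are not part of any Subversion code.

```python
def strip_non_ascii(utf8_bytes: bytes) -> bytes:
    """Drop every byte whose most significant bit is set.

    In UTF-8, every byte of a multibyte sequence (lead byte and
    continuation bytes alike) has the high bit set, so this removes
    whole non-ASCII characters and never leaves a partial sequence.
    """
    return bytes(b for b in utf8_bytes if b < 0x80)


# Hypothetical log message containing one non-ASCII character.
message = "Fixed bug in Göteborg build".encode("utf-8")
print(strip_non_ascii(message).decode("ascii"))
# prints "Fixed bug in Gteborg build"
```

Note that this silently drops characters. A client that prefers a
visible placeholder could, as one alternative, decode the message first
and then call text.encode("ascii", errors="replace"), which substitutes
"?" for each character that has no ASCII equivalent.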

Henrik

Received on Sun Jun 2 04:25:30 2002
