Re: charset neutral? pls solve this

From: Greg Stein <gstein_at_lyra.org>
Date: 2002-06-01 01:08:33 CEST

On Fri, May 31, 2002 at 10:59:24PM +0100, Stephen C. Tweedie wrote:
> On Fri, May 31, 2002 at 02:25:41PM -0700, Greg Stein wrote:
>...
> > Yup. And that UCS-2 was part of my example. And on the Windows platform,
> > UCS-2 is the standard encoding for characters, so it isn't really all that
> > theoretical (well, once you get past the apparent NUL values in there and
> > being okay with casting wchar_t* to char* :-)
>
> This is now getting into "knows enough to be dangerous" territory.
> :-)

hehe... fair enough. I was referring to the BSTR type which is used quite
widely in COM interfaces (and thus, a lot of code), and is UCS-2 (possibly
UTF-16?). But the term "standard encoding" was almost definitely a bit,
umm... "off"? :-)
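
To make the "apparent NUL values" and the UCS-2 versus UTF-16 question concrete, here is a minimal Python sketch. It is only an illustration: the helper c_strlen is a hypothetical stand-in for how C code treats a wchar_t* buffer that has been cast to char*, and the sample strings are made up.

```python
# ASCII text stored as 16-bit little-endian code units, as a BSTR would hold it.
text = "svn"
utf16 = text.encode("utf-16-le")  # every other byte is NUL for ASCII input


def c_strlen(buf: bytes) -> int:
    """Hypothetical stand-in for C strlen(): count bytes before the first NUL."""
    idx = buf.find(b"\x00")
    return len(buf) if idx == -1 else idx


# Code that treats the buffer as a char* string sees only the first byte.
c_len = c_strlen(utf16)

# A character outside the Basic Multilingual Plane needs two 16-bit units
# (a surrogate pair) in UTF-16; strict UCS-2 cannot represent it at all.
clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF
clef_units = len(clef.encode("utf-16-le")) // 2
```

Here c_len comes out as 1 even though the string has three characters, and clef_units is 2, which is the practical difference between UCS-2 and UTF-16.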

>...
> However, on disk, in simple notepad text documents or in emails or
> whatever, Windows is not necessarily using UCS-2. It's often using a

Yah... although, I will note that Notepad gives you an option to store in
UCS-2 :-) (and looking at my W2K box, it now appears they've expanded the
simple checkbox into choices for ANSI, Unicode (little/big-endian), and
UTF-8)
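
As a rough sketch of how a reader could tell those save formats apart, the Python below looks for a byte order mark at the start of the data. It assumes the file actually begins with a BOM for the Unicode and UTF-8 choices, and it assumes "ANSI" means Windows-1252, which really depends on the system locale. The names sniff_encoding and decode_text are hypothetical.

```python
import codecs


def sniff_encoding(data: bytes, fallback: str = "cp1252"):
    """Return (codec name, BOM length) guessed from a leading byte order mark."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8", len(codecs.BOM_UTF8)
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le", len(codecs.BOM_UTF16_LE)
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be", len(codecs.BOM_UTF16_BE)
    # No BOM: assume a legacy "ANSI" code page (an assumption, not a detection).
    return fallback, 0


def decode_text(data: bytes) -> str:
    """Strip the BOM, if any, and decode with the guessed codec."""
    encoding, bom_len = sniff_encoding(data)
    return data[bom_len:].decode(encoding)


# Example: the same two characters saved three different ways.
little = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
big = codecs.BOM_UTF16_BE + "hi".encode("utf-16-be")
plain = b"hi"
```

All three samples decode to the same text with decode_text. Without a BOM there is nothing to detect, so the fallback is only a guess.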

>...
> Pretty much the only advantage you get if you force all strings
> internally to UTF-8 is that when a client comes to translate one
> charset to another, it doesn't have to know anything about the
> encoding used by the original user when submitting the string in the
> first place. But then, it still has to know about that charset to
> display it, so that's really not much of a win.

Um. By using UTF-8, aren't we saying the charset is Unicode? So the fact
that it is in UTF-8 already tells you enough information to display it? Or
did I parse your sentence wrong?
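
A small Python sketch of the point being argued, under the assumption that each client converts to UTF-8 before storing a string. The helper to_utf8 and the sample strings are hypothetical and are not Subversion code.

```python
def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Convert bytes in a client's local charset to UTF-8 for storage."""
    return raw.decode(source_charset).encode("utf-8")


# Two clients submit text in two different legacy charsets.
from_latin1 = "café".encode("latin-1")
from_koi8r = "Привет".encode("koi8_r")

stored = [
    to_utf8(from_latin1, "latin-1"),
    to_utf8(from_koi8r, "koi8_r"),
]

# A reader needs only the one fact "this is UTF-8" to recover both strings;
# it never learns which charset the submitter originally used.
shown = [s.decode("utf-8") for s in stored]

# By contrast, raw legacy bytes read with the wrong charset come out garbled.
misread = from_latin1.decode("koi8_r")
```

Every entry in shown matches what the submitter typed, while misread does not. Whether the reader's display can render every Unicode character is a separate question from knowing how to decode the bytes.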

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Received on Sat Jun 1 14:10:04 2002
