RE: Problems with accents in filenames

From: Dale Peakall <dale_at_peakall.com>
Date: 2003-11-24 16:46:29 CET

> Julian Reschke <julian.reschke@gmx.de> writes
> > Vincent Lefevre wrote:
> > > ...
> > > Some languages can be encoded in UTF-16, which is somewhat
> > > compatible with UTF-8, so no problem here.
> >
> > UTF-16 and UTF-8 are both *encodings* of the same character set
> > (Unicode) and therefore can encode the exact same set of languages.
>
> I think Vincent knows this, and that what he means by
> "somewhat compatible" is that for some strings in some
> languages, the UTF-8 encoding and the UTF-16 encoding will be
> exactly the same sequence of bytes.

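This will almost never be the case. In fact I'd love to see an
example of a sentence where UTF-8 and UTF-16 provide the same
encoding. Both are variable width, but with different code units:
UTF-8 uses one to four bytes per character, whilst UTF-16 uses
two bytes, or four for characters outside the BMP.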

For most western characters (i.e. the ASCII set) the UTF-8 and
UTF-16 encodings will never be the same: UTF-8 encodes each ASCII
character as a single byte and never emits a \x00 byte (except for
NUL itself), whereas UTF-16 pads every ASCII character with a \x00
byte.
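
To make the byte-level difference concrete, here is a minimal Python
sketch. The helper name encodings() is made up for illustration, and
the BOM-less "utf-16-be" / "utf-16-le" codecs are used so that no
byte order mark gets in the way of the comparison.

```python
def encodings(text):
    """Return the UTF-8 and BOM-less UTF-16 encodings of text (hypothetical helper)."""
    return {
        "utf-8": text.encode("utf-8"),
        "utf-16-be": text.encode("utf-16-be"),
        "utf-16-le": text.encode("utf-16-le"),
    }


# An ASCII string, an accented letter, and a character outside the BMP.
for sample in ("abc", "\u00e9", "\U0001F600"):
    for name, data in encodings(sample).items():
        # Show each encoding as hex so the padding bytes are visible.
        print(ascii(sample), name, data.hex())
```

For "abc" the UTF-8 form is three bytes while either UTF-16 form is
six, with a zero byte next to each letter, so the byte sequences
cannot match.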

- Dale.

Received on Mon Nov 24 17:00:36 2003
