
RE: Problems with accents in filenames

From: Dale Peakall <dale_at_peakall.com>
Date: 2003-11-24 16:46:29 CET

> Julian Reschke <julian.reschke@gmx.de> writes
> > Vincent Lefevre wrote:
> > > ...
> > > Some languages can be encoded in UTF-16, which is somewhat
> > > compatible with UTF-8, so no problem here.
> >
> > UTF-16 and UTF-8 are both *encodings* of the same character set
> > (Unicode) and therefore can encode the exact same set of languages.
>
> I think Vincent knows this, and that what he means by
> "somewhat compatible" is that for some strings in some
> languages, the UTF-8 encoding and the UTF-16 encoding will be
> exactly the same sequence of bytes.

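This will almost never be the case. In fact I'd love to see an
example of a sentence where UTF-8 and UTF-16 produce the same
sequence of bytes. UTF-8 is a variable-width encoding built from
8-bit units, whilst UTF-16 is built from 16-bit units (one per
character, or two for characters outside the Basic Multilingual
Plane).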
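
As a rough illustration of how the per-character widths differ, here
is a small sketch assuming Python 3 and its standard codecs. The
helper name `encoded_widths` is hypothetical, and "utf-16-be" is
chosen only so that no byte order mark is added to the output.

```python
def encoded_widths(text):
    """Return (char, utf8_len, utf16_len) for each character in text."""
    return [
        (ch, len(ch.encode("utf-8")), len(ch.encode("utf-16-be")))
        for ch in text
    ]


# 'a', e-acute, the euro sign, and an emoji outside the BMP.
sample = "a\u00e9\u20ac\U0001F600"

for ch, w8, w16 in encoded_widths(sample):
    print(f"U+{ord(ch):04X}: UTF-8 {w8} bytes, UTF-16 {w16} bytes")
```

The byte counts only agree for some individual characters (two bytes
each for U+00E9, four each for U+1F600), and even then the byte values
themselves differ.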

For most western characters (i.e. the ASCII set) the UTF-8 and
UTF-16 encodings will never be the same, as UTF-8 never produces a
\x0 byte (other than for NUL itself) whereas UTF-16 pairs a \x0
byte with every ASCII character.
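
The zero-byte point can be checked with a short sketch, again assuming
Python 3. The function name `compare_encodings` is hypothetical, and
little-endian byte order is an arbitrary choice; big-endian merely
puts the zero byte first in each pair.

```python
def compare_encodings(text):
    """Encode text as UTF-8 and as UTF-16 (little-endian, no BOM)."""
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-le")
    return utf8, utf16


utf8, utf16 = compare_encodings("svn")

print(utf8)            # b'svn'
print(utf16)           # b's\x00v\x00n\x00'
print(0 in utf8)       # False: no zero bytes in the UTF-8 form
print(utf16.count(0))  # 3: one zero byte per ASCII character
```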

        - Dale.

Received on Mon Nov 24 17:00:36 2003

This is an archived mail posted to the Subversion Dev mailing list.