
RE: Problems with accents in filenames

From: Dale Peakall <dale_at_peakall.com>
Date: 2003-11-24 17:24:44 CET

> Dale Peakall wrote:
>
> > This will almost never be the case. In fact I'd love to see an
> > example of a sentence where UTF-8 and UTF-16 provide the same
> > encoding. UTF-8 is a variable width encoding, whilst UTF-16 is
> > fixed width. ...
>
> Well, no. UTF-16 is only fixed width for most Unicode characters
> (those with a character code below 0x10000). See RFC2781, section 2.

True, however, most people - incorrectly - use UTF-16 interchangeably
with UCS-2, which is a character set equivalent to the basic
multilingual plane (BMP) of Unicode. UCS-2 is generally encoded as a
fixed width 16 bit character, which is identical to UTF-16 for all
characters in the BMP.
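
To make the BMP distinction concrete, here is a rough Python sketch
(not anything from Subversion itself; the helper names are made up for
illustration). It counts how many bytes UTF-8 needs and how many 16 bit
code units UTF-16 needs for a few sample characters, the last of which
lies outside the BMP.

```python
# Illustrative helpers only; these names are hypothetical and do not
# come from Subversion or any other library.

def utf8_length(text: str) -> int:
    """Number of bytes needed to encode text as UTF-8."""
    return len(text.encode("utf-8"))


def utf16_code_units(text: str) -> int:
    """Number of 16 bit code units needed to encode text as UTF-16.

    "utf-16-be" is used so that no byte order mark is prepended.
    """
    return len(text.encode("utf-16-be")) // 2


def in_bmp(ch: str) -> bool:
    """True if the single character ch has a code point below 0x10000."""
    return ord(ch) < 0x10000


samples = [
    "A",           # U+0041, ASCII
    "\u00e9",      # U+00E9, e with acute accent
    "\u20ac",      # U+20AC, euro sign
    "\U0001D11E",  # U+1D11E, musical G clef, outside the BMP
]

for ch in samples:
    print(
        f"U+{ord(ch):04X}",
        "BMP" if in_bmp(ch) else "non-BMP",
        "utf8_bytes =", utf8_length(ch),
        "utf16_units =", utf16_code_units(ch),
    )
```

Every BMP character takes exactly one 16 bit unit in UTF-16, which is
why it looks like UCS-2 there; the character above 0xFFFF needs a
surrogate pair, i.e. two units. Note also that even plain ASCII text
differs between the two encodings, since UTF-16 spends two bytes on
each character where UTF-8 spends one.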

I'm pretty sure Win32 will blow up if you try to use characters outside
the BMP, not that I've tried. If anyone has ever made this work, I'd
love to hear about it (off-list).

        - Dale.

Received on Mon Nov 24 17:33:52 2003

This is an archived mail posted to the Subversion Dev mailing list.