
Re: character set handling (was: Re: [proposal] --targets command line option)

From: Marcus Comstedt <marcus_at_mc.pp.se>
Date: 2002-03-01 02:37:00 CET

Greg Stein <gstein@lyra.org> writes:

> Also note that (today) we don't do the conversion on input, which
> gives us problems with i18n users. When they use a character with the
> 8th bit turned on, then things bust cuz it is fed into a UTF-8
> character parser (which promptly declares it an illegal UTF-8 char).

Conversion on output is of course just as important as conversion on
input. If I commit the file 'räksmörgås' to the repository, I expect
it to still be called 'räksmörgås' when I check it out again, and not
some UTF-8 mumbo jumbo.
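
A minimal sketch of that round trip in Python, assuming the client
locale is ISO-8859-1 and the repository stores names as UTF-8. The
helper names are hypothetical and are not Subversion APIs:

```python
# Hypothetical helpers for illustration only; not Subversion APIs.
LOCAL_ENCODING = "iso-8859-1"  # assumed client locale encoding


def to_repository(name_bytes: bytes, local_encoding: str = LOCAL_ENCODING) -> bytes:
    """Input conversion: locale-encoded bytes -> UTF-8 bytes for storage."""
    return name_bytes.decode(local_encoding).encode("utf-8")


def from_repository(stored: bytes, local_encoding: str = LOCAL_ENCODING) -> bytes:
    """Output conversion: stored UTF-8 bytes -> locale-encoded bytes."""
    return stored.decode("utf-8").encode(local_encoding)


# The filename as it exists on an ISO-8859-1 client.
local_name = "räksmörgås".encode(LOCAL_ENCODING)

stored = to_repository(local_name)      # what the repository would hold
checked_out = from_repository(stored)   # what the client gets back
```

With both conversions in place, checked_out is byte-for-byte the same
as local_name, even though the stored form differs.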

Especially troublesome is the case where the octet sequence of a
filename actually _does_ make up a legal UTF-8 sequence, because that
would allow me to commit the file to the repository, but when output
conversion is implemented, the file changes name!
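
To illustrate, here is a small Python sketch, again assuming an
ISO-8859-1 client. The two-octet name below is a made-up example, and
is_valid_utf8 is a hypothetical helper:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the octets form a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Typical case: 0xE4 followed by 'k' is not legal UTF-8, so the raw
# ISO-8859-1 name is rejected at commit time.
typical = "räksmörgås".encode("iso-8859-1")

# Troublesome case: the octets 0xC3 0xA4 are a two-character name in
# ISO-8859-1, but also happen to be the legal UTF-8 encoding of U+00E4.
ambiguous = bytes([0xC3, 0xA4])

# If those octets were stored unconverted and output conversion is
# added later, checkout decodes them as UTF-8 and re-encodes them for
# the locale, producing a different name.
renamed = ambiguous.decode("utf-8").encode("iso-8859-1")
```

Here the first name fails validation, the second passes, and the
second comes back from the repository as the single octet 0xE4 rather
than the two octets that were committed.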

To avoid such disasters, it is IMO imperative that input/output
conversion is implemented before 1.0.

  // Marcus

Received on Sat Oct 21 14:37:10 2006