
Re: SVN Win32 Developers -- need some help

From: Stefan Küng <tortoisesvn_at_gmail.com>
Date: 2007-05-30 21:07:21 CEST

Mark Phippard wrote:

>> TortoiseSVN "seems" to be one of the main ide/tools of choice for
>> Windows SVN users... this is surmised based on the fact that there have
>> now been over 4,500,000 downloads of it from Sourceforge.
>> I don't know how many of these are non-english users, but some of the
>> main Tortoise devs seem to be.

Yes, I am from Switzerland, and we have some non-ASCII chars in our
language (e.g., öäü, and of course some French chars too, since
Switzerland has four languages).
I can also confirm that even Chinese and Russian chars are converted
correctly.

>> Doesn't this mean that, in a sense, it has been effectively field tested
>> under various locales for some time?

Yes, it has been tested for some time now. But since that change is only
in the TSVN trunk, not that many people are really using it yet.

>> The only objection in the original thread seems to come from someone's
>> recollections about incomplete utf8 support under Windows.

Yes, and such arguments simply can't be refuted. You see, there is no
way, at least none that I know of, to prove that it really works
*exactly* like apr-iconv.

>> I recall PostgreSql running into something like this as well so I
>> googled over there and came up with the following:
>> http://archives.postgresql.org/pgsql-odbc/2006-03/msg00227.php
>> and
>> http://pginstaller.projects.postgresql.org/faq/FAQ_windows.html#2.6
>> Perhaps there is some useful "facts" there.

If you read those you will see that it's not really a Windows problem
but a problem with code pages. Of course it's not possible to convert
all UTF-8 sequences into a single code page. And Windows has its own
local code pages (because they're not really standardized (is that a
word?)). And since each OS has its own local (custom) code pages, you
can run into trouble. But, as it also says there, both
MultiByteToWideChar() and WideCharToMultiByte() work just fine, and
that's all we use.
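
To illustrate the code-page limitation, here is a minimal, portable sketch. It is not code from TSVN or from the patch, and the helper name utf8_to_latin1 is hypothetical. It decodes UTF-8 and writes ISO-8859-1 bytes, replacing every character that has no slot in that code page with '?'. On Windows the same round trip would go through MultiByteToWideChar() with CP_UTF8 and then WideCharToMultiByte() with the target code page, whose lpUsedDefaultChar parameter reports this kind of loss.

```c
#include <stddef.h>

/* Hypothetical helper (illustration only): convert a NUL-terminated
 * UTF-8 string to ISO-8859-1. Code points above U+00FF have no byte in
 * that code page, so they are replaced with '?'.
 * Returns the number of replaced characters, or -1 on malformed input
 * or when the output buffer is too small.
 * Simplification: overlong encodings are not rejected. */
int utf8_to_latin1(const char *in, char *out, size_t out_size)
{
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;
    int replaced = 0;

    if (out_size == 0)
        return -1;

    while (*p) {
        unsigned long cp;
        int extra;

        /* The lead byte tells us how many continuation bytes follow. */
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return -1;               /* invalid lead byte */
        p++;

        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return -1;            /* bad or missing continuation byte */
            cp = (cp << 6) | (*p & 0x3F);
            p++;
        }

        if (n + 1 >= out_size)
            return -1;                /* keep room for the terminator */

        if (cp <= 0xFF) {
            out[n++] = (char)(unsigned char)cp;
        } else {
            out[n++] = '?';           /* not representable in Latin-1 */
            replaced++;
        }
    }
    out[n] = '\0';
    return replaced;
}
```

With this sketch, "öäü" converts without loss, while a Chinese or Russian character comes back as '?'. Such text only survives as long as it stays in UTF-8 or UTF-16 and is never forced into a single local code page.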

> Stefan did point out one limitation currently and that is support for
> the --encoding option on the command line. This is not an issue for
> TortoiseSVN because it does not expose that option to users. But
> apparently that option allows you to specify an arbitrary encoding and
> he did not find an API that he could use for that.
> It is possible he did not try very hard since he did not need it.
> Personally, I'd rather lose that option in the Windows command line
> than live with APR ICONV any longer.

Well, we could add some kind of table which maps the code-page strings
to the Windows code-page defines. But since I had no need for that, I
skipped it (also because I had no intention of going through all the
trouble again to get my patch accepted).
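
Such a table might look roughly like the sketch below. This is an assumption about one possible shape, not part of the patch. The names codepage_entry, codepage_table and codepage_from_name are hypothetical, and the list covers only a handful of encodings. The numbers are the Windows code-page identifiers that would be passed as the CodePage argument of MultiByteToWideChar() and WideCharToMultiByte().

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical mapping from an encoding name, as a user might pass it
 * to --encoding, to a Windows code-page identifier. */
struct codepage_entry {
    const char *name;
    int codepage;
};

/* Deliberately incomplete: a real table would need many more entries
 * and probably aliases (e.g. "latin1" for ISO-8859-1). */
static const struct codepage_entry codepage_table[] = {
    { "UTF-8",        65001 },
    { "US-ASCII",     20127 },
    { "ISO-8859-1",   28591 },
    { "ISO-8859-2",   28592 },
    { "ISO-8859-15",  28605 },
    { "WINDOWS-1250", 1250  },
    { "WINDOWS-1251", 1251  },
    { "WINDOWS-1252", 1252  },
    { "KOI8-R",       20866 },
    { "SHIFT_JIS",    932   },
    { "BIG5",         950   },
};

/* ASCII case-insensitive comparison of two encoding names. */
static int names_equal(const char *a, const char *b)
{
    while (*a && *b) {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;  /* equal only if both strings ended together */
}

/* Returns the code-page identifier for `name`, or -1 if the name is
 * not in the table. */
int codepage_from_name(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof codepage_table / sizeof codepage_table[0]; i++) {
        if (names_equal(name, codepage_table[i].name))
            return codepage_table[i].codepage;
    }
    return -1;
}
```

A caller would look up the name given on the command line and, if the result is not -1, use it as the code page for the conversion. Unknown names would have to be rejected with an error, since there is no sensible fallback for an arbitrary encoding.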


   oo  // \\      "De Chelonian Mobile"
  (_,\/ \_/ \     TortoiseSVN
    \ \_/_\_/>    The coolest Interface to (Sub)Version Control
    /_/   \_\     http://tortoisesvn.net
Received on Wed May 30 21:07:39 2007

This is an archived mail posted to the Subversion Dev mailing list.