
RE: Bug report: UTF-8 encoded files merge issue

From: Gary Hai <haiguiqing_at_gmail.com>
Date: 2005-12-31 07:42:25 CET

Great job, Stefan! I downloaded and installed the new 1.3.0-RC1.
TortoiseMerge now works pretty well with my UTF-8 XML files with Chinese
characters in them. I am amazed at how quick your response was. Anyway,
this is just my first look at the new version, so I cannot yet be sure the
encoding/decoding issue is gone.

Happy new year!

Gary Hai

>> I have tried to change the font settings, but nothing happened.
>> From the About dialog, I got this version info: TortoiseSVN 1.2.6, Build
>> 4786 - 32 Bit.

>> According to the specification of UTF-8, it seems very hard to detect
>> UTF-8 encoding. Anyway, if TortoiseMerge were XML-aware, it could get
>> the encoding info from the XML header (defaulting to UTF-8).
>> It is a very good suggestion to let the user specify the encoding to use,
>> because there is already a drop-down menu for selecting the language.

>> P.S. I have written an email to Google asking for XML support in their Google
>> product, and they added the feature very fast. We all know that XML is
>> more and more popular.

-----Original Message-----
From: Stefan Küng [mailto:tortoisesvn@gmail.com]
Sent: Friday, December 30, 2005 3:45 PM
To: dev@tortoisesvn.tigris.org
Subject: Re: Bug report: UTF-8 encoded files merge issue

Simon Large wrote:
> Kalin KOZHUHAROV wrote:
>> By default TortoiseMerge (still) does not support UTF-8 or any other
>> encoding. The only supported encoding is the default of your OS
>> (depends on the OS language).
> It *does* support UTF-8, but it requires a proper BOM at start of
> file, otherwise it cannot be sure what the encoding is. I think Stefan
> has just improved the auto-recognition so it will find UTF-8 with no
> BOM more easily now.

Yes, it now checks the whole file for UTF8 sequences and only loads the
files as UTF8 if
- there are no chars that are illegal in UTF8
- at least one UTF8 sequence is found
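
For illustration only, the two conditions above can be sketched in a few
lines of Python. This is not TortoiseMerge's actual code: the function name
looks_like_utf8 is hypothetical, and Python's strict decoder stands in for
the "no illegal chars" check, so it may be stricter or looser than the real
implementation. BOM handling is left out.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Hypothetical helper: True only if both conditions from the mail hold."""
    try:
        # Condition 1: no byte sequences that are illegal in UTF-8.
        # Strict decoding raises UnicodeDecodeError on the first illegal one.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # Condition 2: at least one real UTF-8 (multi-byte) sequence is present.
    # Pure ASCII also decodes cleanly, so valid data needs a byte >= 0x80.
    return any(b >= 0x80 for b in data)


if __name__ == "__main__":
    print(looks_like_utf8("<name>中文</name>".encode("utf-8")))  # valid multi-byte UTF-8
    print(looks_like_utf8(b"<name>plain</name>"))                # ASCII only
    print(looks_like_utf8("中文".encode("gbk")))                  # legacy encoding
```

The second condition is what keeps plain ASCII files, which are also valid
UTF8, from being reported as a positive match.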

> Stefan, how hard would it be to add a menu to allow the user to
> specify the encoding to use? Default would be 'Auto' as it is now, but
> the user could override that if needed.

There's a little problem with that: the files are *loaded* in a specific
encoding. So if you want to switch the encoding, the files would have to be
reloaded (and so you would lose all your modifications).
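
A small Python sketch of why the encoding is fixed at load time (purely
illustrative, with a hypothetical load_text helper, not how TortoiseMerge is
written): the same bytes on disk become different in-memory text depending on
the encoding chosen, so switching means decoding the original bytes again.

```python
def load_text(raw: bytes, encoding: str) -> str:
    """Hypothetical loader: decode the file's raw bytes with one fixed encoding."""
    return raw.decode(encoding, errors="replace")


# The same bytes, as they would sit on disk.
raw = "<name>中文</name>".encode("utf-8")

as_utf8 = load_text(raw, "utf-8")      # the intended text
as_latin1 = load_text(raw, "latin-1")  # every byte becomes one character

if __name__ == "__main__":
    print(len(as_utf8), len(as_latin1))
    print(as_utf8 == as_latin1)
```

Any edits are made to the decoded text, not to the bytes, which is why a
change of encoding cannot simply be applied to what is already in the editor.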

I'd rather have someone else implement this feature, because I can't really
test this fully. I mean, even if e.g. Shift-JIS chars would be shown
correctly on my machine, I couldn't tell if it's correct because I just
can't read those.


   oo  // \\      "De Chelonian Mobile"
  (_,\/ \_/ \     TortoiseSVN
    \ \_/_\_/>    The coolest Interface to (Sub)Version Control
    /_/   \_\     http://tortoisesvn.tigris.org
To unsubscribe, e-mail: dev-unsubscribe@tortoisesvn.tigris.org
For additional commands, e-mail: dev-help@tortoisesvn.tigris.org
Received on Sat Dec 31 07:41:44 2005

This is an archived mail posted to the TortoiseSVN Dev mailing list.
