
Re: TMerge and encodings - possible bug

From: Sven Brueggemann <SBrueggemann_at_gmx.net>
Date: 2006-08-26 16:21:10 CEST

Hello,

> Sven Brueggemann wrote:
>> how does TMerge decide which encoding a file is in?
>> I have two UTF-8 files (both without BOM) - the left one
>> is displayed correctly, the right one in ASCII (double
>> byte characters as two characters).
> * BOMs have priority. If a BOM is present, the encoding set
> by the BOM is used.
> * If no BOM is present, then TMerge scans the file for invalid
> UTF-8 sequences. If such an invalid sequence is found, ASCII
> encoding is used.
> * If no BOM is present and no invalid UTF-8 sequence is found,
> the UTF-8 encoding is used.
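
The three rules quoted above can be sketched roughly as follows. This is only an illustration in Python, not TMerge's actual code: the function name guess_encoding, the BOM table and the returned labels are assumptions made for the example, and "ascii" simply stands for the fallback that the rules call ASCII.

```python
import codecs

# BOM table, checked in this order so that the UTF-32 LE BOM
# (FF FE 00 00) is not mistaken for the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_encoding(data: bytes) -> str:
    """Hypothetical helper mirroring the three rules quoted above."""
    # Rule 1: a BOM has priority over everything else.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # Rule 2: no BOM and at least one invalid UTF-8 sequence -> fallback.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "ascii"
    # Rule 3: no BOM and no invalid sequence -> UTF-8.
    return "utf-8"
```

With rules like these, a single stray byte anywhere in the file is enough to flip the whole file to the fallback encoding, which would match a display that changes for the entire file at once.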

Thanks. I wasn't able to find an invalid sequence in the file,
but it seems that TMerge is actually causing the problem.

To reproduce:

1. Apply the attached patch and save.
2. Do a diff and see that both BASE and WC look o.k.
3. On the right side, go to line 14 (Last-Translator), select it,
and choose "Use other text block" in the context menu.
4. Save.
5. Diff again. You'll see the right file displayed in ASCII.

When I hex diff both sides, there's no difference in the first
6 KB of the files (except the timestamp in line 13). The
offending line seems to be 352, although I can't see an
invalid sequence there either.
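
One way to check a saved file for an invalid sequence without reading a hex dump by eye is to let a strict UTF-8 decoder report where it fails. The following is a minimal Python sketch; first_invalid_utf8 is a made-up helper name, and it assumes the file is small enough to read into memory and uses LF to count lines.

```python
def first_invalid_utf8(data: bytes):
    """Return (byte_offset, line_number) of the first byte that is not
    valid UTF-8, or None if the data decodes cleanly.

    Hypothetical helper for illustration; line numbers are 1-based and
    counted by LF bytes, so CRLF files are handled as well.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        offset = err.start  # index of the first offending byte
        line = data.count(b"\n", 0, offset) + 1
        return offset, line
    return None
```

It could be run on the right-hand file after step 4, for example with first_invalid_utf8(open(path, "rb").read()), where path is whatever file TMerge just saved. A result of None would suggest the bytes on disk are valid UTF-8 and the problem lies in how the file is detected rather than in how it is written.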

TSVN 1.4.0.7329, 32 Bit

BTW: Please don't commit the file - I want to discuss some of the
changes with Lübbe first, when he's back from his holidays.

Kind regards

Sven


Received on Sat Aug 26 16:21:25 2006

This is an archived mail posted to the TortoiseSVN Dev mailing list.
