
Re: Bug with UTF-8 files

From: Ulrich Eckhardt <ulrich.eckhardt_at_dominolaser.com>
Date: Thu, 28 Jul 2011 09:35:27 +0200

On Wednesday 27 July 2011, you wrote:
> Hi. I'm using TortoiseSVN 1.6.16, Build 21511, and have the following bug:
> a patch containing newly created file(s) encoded in UTF-8 is applied
> incorrectly. Here is the explanation:
> 1. Create a new file in UTF-8 (without BOM).
> 2. Add some lines of text in several languages that use different
> ANSI codepages (e.g. Russian (ANSI 1251), Polish (ANSI 1250),
> English (ANSI 1252), etc.).
> 3. Create a patch using TortoiseSVN. At this stage
> all looks fine: when you open the patch, the codepage is treated as UTF
> and all characters are OK.
> 4. Revert the changes to the tree (or use another tree) and
> apply the patch. TortoiseSVN will create the needed file, but it will be not in
> utf-8 but in ansi with broken non1252-chars.

Just to confirm: did you verify with a hex editor or a similar tool that the
file contained valid UTF-8 after editing (step 2) and that it no longer
contained valid UTF-8 after applying the patch (step 4)? The point is that
without a BOM, some tools guess the encoding using heuristics, which can and
do fail.
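
If no hex editor is at hand, the same check can be sketched in a few lines of
Python. This is only an illustration: the function name inspect_bytes and the
sample strings are my own assumptions and have nothing to do with how
TortoiseSVN itself reads or writes files. It reports whether the bytes start
with a UTF-8 BOM, whether they decode as strict UTF-8, and shows the first few
bytes in hex.

```python
import codecs


def inspect_bytes(data: bytes) -> dict:
    """Report BOM presence and strict UTF-8 validity for raw file bytes.

    Hypothetical helper for illustration; not part of any SVN tooling.
    """
    has_bom = data.startswith(codecs.BOM_UTF8)
    try:
        # Strict decoding raises on any byte sequence that is not valid UTF-8.
        data.decode("utf-8", errors="strict")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    # Hex view of the first 32 bytes, similar to what a hex editor shows.
    hex_preview = " ".join(f"{b:02x}" for b in data[:32])
    return {"has_bom": has_bom, "valid_utf8": valid_utf8, "hex": hex_preview}


if __name__ == "__main__":
    # Assumed sample data: the same Russian word in two encodings.
    print(inspect_bytes("Привет".encode("utf-8")))
    print(inspect_bytes("Привет".encode("cp1251")))
```

To check a real file, read it in binary mode (open(path, "rb").read()) and
pass the result to the function, once before creating the patch and once
after applying it.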

Your explanation also puzzles me. You say the file is "not in utf-8
but in ansi with broken non1252-chars"; what exactly does that mean? If you
open a file containing UTF-8 encoded text and interpret its contents
differently, e.g. as the current single-byte codepage, then of course its
content appears garbled.
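
To illustrate that last point, here is a small Python sketch that decodes
correct UTF-8 bytes as if they were Windows-1252. The helper name
misread_utf8 and the sample text are assumptions made up for this example. The
bytes themselves are never changed; only the interpretation differs, which is
why an intact file can look broken in a viewer.

```python
def misread_utf8(text: str, codepage: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode the bytes with a single-byte codepage.

    Hypothetical helper showing what a viewer displays when it guesses
    the wrong encoding; bytes undefined in the codepage become U+FFFD.
    """
    raw = text.encode("utf-8")
    return raw.decode(codepage, errors="replace")


sample = "Привет, świat"  # assumed sample: Russian and Polish in one line
garbled = misread_utf8(sample)

# For this sample every byte is defined in cp1252, so the misreading is
# reversible and the original text can be recovered from the garbled view.
restored = garbled.encode("cp1252").decode("utf-8")

if __name__ == "__main__":
    print(garbled)
    print(restored == sample)
```

If the file on disk round-trips like this, it still holds valid UTF-8 and the
problem lies in the tool used to view it, not in the patch.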

That said, it would help if you provided the original file, the file after
editing and the patch that was generated, reduced of course to a sensible
amount of data (just a few lines, if possible).



Received on 2011-07-28 09:28:32 CEST

This is an archived mail posted to the TortoiseSVN Users mailing list.
