
bug: Incorrect UTF-16 detection

From: Sébastien Kirche <sebastien.kirche_at_free.fr>
Date: Wed, 7 Oct 2015 06:45:52 -0700 (PDT)

Hi,
I have a program that outputs small text files with no Unicode encoding (that is, ANSI-encoded). It also incorrectly appends a null character at the end of each file.

I have noticed that for files smaller than 50 bytes this breaks the Unicode detection in TortoiseMerge, which displays my small files as Chinese characters and wrongly reports the encoding as UTF-16LE. If I artificially increase the size beyond 50 bytes, the file is shown correctly.

The culprit seems to be in src/TortoiseMerge/FileTextLines.cpp at line 122 (and perhaps 153), in a hack that compares the null character count to the file size divided by 50. For files smaller than 50 bytes that quotient presumably rounds down to zero, so any number of null characters incorrectly results in a UTF-16 display.
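
To illustrate the behaviour I am describing, here is a minimal sketch of such a check. It is not the actual TortoiseMerge code: the names looksLikeUtf16 and checkExamples are made up for this example, and I am assuming integer division and a strict "greater than" comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the heuristic described above; NOT the real
// TortoiseMerge source. Assumes integer division and a strict '>' test.
bool looksLikeUtf16(const std::string& data)
{
    std::size_t nullCount = 0;
    for (char c : data)
        if (c == '\0')
            ++nullCount;

    // For data.size() < 50 the threshold is 0, so one null byte is enough.
    return nullCount > data.size() / 50;
}

// Small self-check reproducing the two cases from the report.
void checkExamples()
{
    // 20 bytes of ANSI text plus one stray trailing null: misdetected.
    std::string small(20, 'a');
    small.push_back('\0');
    assert(looksLikeUtf16(small));

    // Same stray null, but padded to 60 bytes: the threshold becomes 1.
    std::string padded(59, 'a');
    padded.push_back('\0');
    assert(!looksLikeUtf16(padded));
}
```

Under these assumptions, a 21-byte file with a single trailing null is classified as UTF-16, while the same content padded to 60 bytes is not, which matches what I see.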

------------------------------------------------------
http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=4061&dsMessageId=3141293


Received on 2015-10-07 17:12:25 CEST

This is an archived mail posted to the TortoiseSVN Users mailing list.
