
Re: SVN Blame Returns Corrupt Data

From: Branko Čibej <brane_at_wandisco.com>
Date: Fri, 11 Oct 2013 21:23:38 +0200

On 11.10.2013 18:52, Ben Reser wrote:
> On 10/11/13 9:22 AM, Branko Čibej wrote:
>> You'd have to extend Subversion's file type detection to detect UTF-16.
>> See svn_io_detect_mimetype2 in line 3333 in this file:
>>
>> http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_subr/io.c?view=markup
>> Subversion currently only looks at the first 1k Bytes of a file. It may
>> be enough to check that this initial part of the file contains only
>> valid UTF-16 (BE or LE) codes.
> Even if all we looked for is the BOM it might be helpful enough. I suspect the
> development tools producing UTF-16 are including BOMs. Windows seems to be
> fond of including them, Notepad puts one even on UTF-8.

That would work only on Windows. On other platforms, you typically don't
get a BOM (strictly speaking U+FEFF, a zero-width no-break space) at the
beginning of a file. Granted, other platforms most likely use UTF-8 in
any case.
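
To make the two ideas in this thread concrete, here is a rough sketch of a BOM check and a "first 1k is valid UTF-16" check. It is written in Python for brevity, not in Subversion's C, and it is not what svn_io_detect_mimetype2 does. The names detect_utf16_bom, looks_like_utf16 and SNIFF_SIZE are made up for illustration.

```python
import codecs

# Assumption: mirror the "first 1k bytes" window mentioned in the thread.
SNIFF_SIZE = 1024


def detect_utf16_bom(head: bytes):
    """Return 'utf-16-le', 'utf-16-be', or None, looking only at a leading BOM."""
    # The UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE),
    # so rule UTF-32 out first. This is a judgment call: a UTF-16 LE file
    # whose first character is U+0000 would be rejected here.
    if head.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return None
    if head.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if head.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None


def looks_like_utf16(head: bytes, encoding: str) -> bool:
    """Return True if the sniffed prefix decodes as UTF-16 in the given byte order."""
    head = head[:SNIFF_SIZE]
    # Drop a trailing odd byte; the window may cut a code unit in half.
    head = head[: len(head) - (len(head) % 2)]
    if not head:
        return False
    decoder = codecs.getincrementaldecoder(encoding)(errors="strict")
    try:
        # final=False tolerates a surrogate pair split by the end of the window.
        decoder.decode(head, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

Note that the validity check is weak on its own: almost any even-length byte sequence decodes as UTF-16, and only misplaced surrogates are rejected. So without a BOM it can at best rule files out, which is why the BOM question matters here.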

-- Brane

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. brane_at_wandisco.com
Received on 2013-10-11 21:24:31 CEST

This is an archived mail posted to the Subversion Users mailing list.
