
Re: SVN Blame Returns Corrupt Data

From: Branko Čibej <brane_at_wandisco.com>
Date: Fri, 11 Oct 2013 16:59:19 +0200

On 11.10.2013 16:55, Bob Archer wrote:
>> On 11.10.2013 15:58, Bob Archer wrote:
>>>> On Thu, Oct 10, 2013 at 5:49 PM, Bob Archer <Bob.Archer_at_amsi.com>
>> wrote:
>>>> I assume he was asking how to "fix" the blame. Cause, sure, he could
>>>> open the file, convert it back to UTF-8 with CRLF line endings... and
>>>> commit it... of course, now blame is going to show him on every line,
>>>> since he just changed every line.
>>>>
>>>> That's exactly what I meant. You're correct with how the blame is
>>>> handled. I committed the UTF-8 copy to a test branch, diff'd, and it
>>>> showed every line as being changed. Unfortunately it looks like this is our
>> best option.
>>> Yep, we have done the same thing. As a matter of fact, I just over the past
>> few days rescripted all our database scripts to be UTF-8 since merging them
>> just doesn't work correctly when they are UTF-16 even if you remove the
>> binary mime type.
>>>> On Thu, Oct 10, 2013 at 7:07 PM, Ben Reser <ben_at_reser.org> wrote:
>>>> At current blame is not UTF-16 aware.
>>> It's not just blame that isn't... the diff engine, or whatever detects file
>> types always considers UTF-16 files to be binary. If you "add" a UTF-16 file
>> you see that svn adds the application/octet-stream mime type. There is an
>> issue in the bug database about this from when I reported/complained about
>> it... however it hasn't been addressed. I'm surprised still at this time that svn
>> still can't support UTF-16 text files as text wrt adding, diffing, blaming, etc.
>>
>> It's quite simple: no-one has written the necessary code. While I can
>> understand it's an interesting feature for Windows users, most Subversion
>> developers have other things to do. This being a volunteer project, and most
>> of us do not use Windows, you can hardly expect anyone to spend several
>> weeks on solving a problem that has a perfectly simple workaround. Since
>> UTF-8 and UTF-16 can be interchanged without data loss, there are other,
>> much more important things to do in Subversion.
> I appreciate all that you said. I didn't expect that UTF-16 was so uncommon in non-Windows OSes. A large number of dev tools that I work with on Windows, especially the Microsoft tools default to creating UTF-16 files.
>
> I disagree with your "can be converted without data loss". If you need UTF-16 then you need it. Also, if you are working in an international team and you have developers on other-language OSes with different code pages, then what you see when you look at a UTF-8 file might be different from what I see.

I don't follow. Both UTF-16 and UTF-8 are complete encodings of the
Unicode character set. Exactly the same code point sequences can be
represented in both. You can convert well-formed UTF-16 to UTF-8 and
back and get exactly the same sequence of bytes.
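
To illustrate the round trip, here is a minimal Python sketch. It assumes
well-formed, little-endian UTF-16 input with a BOM, as Windows tools
typically write it; the helper names and the sample text are made up for
this example and have nothing to do with Subversion itself.

```python
def utf16_to_utf8(data: bytes) -> bytes:
    """Re-encode little-endian UTF-16 bytes as UTF-8.

    The "utf-16-le" codec does not strip a leading BOM; it is kept as the
    ordinary character U+FEFF, so it survives the conversion.
    """
    return data.decode("utf-16-le").encode("utf-8")


def utf8_to_utf16(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes as little-endian UTF-16."""
    return data.decode("utf-8").encode("utf-16-le")


# Hypothetical sample: BOM, non-ASCII text, a character outside the BMP
# (stored as a surrogate pair in UTF-16), and a CRLF line ending.
text = "\ufeffSELECT 'Čibej', '日本語', '😀';\r\n"
original = text.encode("utf-16-le")

as_utf8 = utf16_to_utf8(original)
roundtrip = utf8_to_utf16(as_utf8)

print(roundtrip == original)
```

Input that is not well-formed UTF-16 (for example an unpaired surrogate)
makes the decode step raise an error instead of converting silently.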

-- Brane

-- 
Branko Čibej | Director of Subversion
WANdisco // Non-Stop Data
e. brane_at_wandisco.com
Received on 2013-10-11 17:00:06 CEST

This is an archived mail posted to the Subversion Users mailing list.
