
RE: Re: "Save to Clipboard" including BOM marker

From: Gavin Lambert <colnet_at_mirality.co.nz>
Date: Fri, 17 Jul 2015 00:47:35 -0700 (PDT)

On 17/07/2015 04:27, quoth Stefan Küng:
>> This seems incorrect. The clipboard should only contain textual
>> content; it should not include an initial BOM in any case. (*Files*
>> contain an initial BOM because there is otherwise no reliable way to
>> determine if the content is ANSI or Unicode. The clipboard does not
>> have that issue.)

BTW, it may have been unclear, but in this part I'm referring to a BOM at the very start of the clipboard data (i.e. what would be the initial BOM in the patch file itself), not a BOM from the diff content of the original files.

It's possible that I misinterpreted you, but it sounded like you were saying that TortoiseMerge would write a BOM to the clipboard, followed by the actual patch content (as if the clipboard were a file). That's the part that I was mainly objecting to, not whether the patch content contained another BOM or not.

If I did misinterpret this then I apologise for the noise.
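
To make the distinction concrete, here is a minimal Python sketch of what I mean. It is not TortoiseMerge code; the helper name strip_leading_bom and the sample patch text are made up for illustration. It drops only a BOM at the very start of the text headed for the clipboard and leaves any BOM inside the patch content alone.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark once decoded to text


def strip_leading_bom(text: str) -> str:
    """Drop one BOM at the very start; leave any later U+FEFF untouched."""
    return text[1:] if text.startswith(BOM) else text


# Hypothetical patch text: a file-level BOM up front, plus a BOM that is
# part of the diff content (the "+" line adds a BOM to the target file).
patch_text = BOM + "Index: a.txt\n-hello\n+" + BOM + "hello\n"

# What I would expect to land on the clipboard under this assumption.
clipboard_text = strip_leading_bom(patch_text)
```

The clipboard already carries its own format information (for example, the Unicode text format on Windows), so the leading marker adds nothing there, while the inner one is actual patch content.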

>> This shouldn't happen either. The BOM should be stripped from the
>> file content prior to generating the diff.
> Sorry, but that would be a big bug.
> The diff must contain the BOM, because it would be broken if you add or
> remove the BOM from a file, and then do a diff: if a patch file would
> not contain changes to the BOM, you could not apply such a patch file
> and get the correct results.

True, although I was referring to the clipboard copy or UI display rather than the file output -- you're already showing format differences on the status bar after all.
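
For what it's worth, the trade-off is easy to reproduce with Python's difflib. This is only a sketch under my own assumptions (the helper diff_lines and the file names a.txt and b.txt are invented, and it says nothing about how TortoiseMerge generates diffs): keeping the BOM makes the change visible on the first line, and stripping it first makes the diff come out empty.

```python
import difflib


def diff_lines(old_bytes, new_bytes, codec):
    """Unified diff of two byte strings after decoding them with codec."""
    old = old_bytes.decode(codec).splitlines(keepends=True)
    new = new_bytes.decode(codec).splitlines(keepends=True)
    return list(difflib.unified_diff(old, new, "a.txt", "b.txt"))


without_bom = b"hello\nworld\n"
with_bom = b"\xef\xbb\xbf" + without_bom  # same text with a UTF-8 BOM added

# "utf-8" keeps the BOM as U+FEFF, so the first line shows up as changed.
patch_keeping_bom = diff_lines(without_bom, with_bom, "utf-8")

# "utf-8-sig" strips a leading BOM, so both sides decode to identical text.
patch_stripping_bom = diff_lines(without_bom, with_bom, "utf-8-sig")
```

In the first case the patch can recreate the BOM change; in the second the change is lost entirely, which is the bug you describe.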

It's a tricky one, though: if the patch is copied to the clipboard and then pasted into a file editor, there doesn't seem to be a good solution either way.

Ideally there should be some out-of-band way to signal to the patch tool to change the file encoding without that affecting the patch content. Unfortunately that seems like the sort of thing that should have been done a couple of decades ago, and may not be practical now.

That brings up a related question, though -- how would you generate a patch that would successfully convert a previously existing file from ANSI to UTF-16-without-BOM format? I don't think there's any way to represent that, unless you start making assumptions based on the format of the patch file itself (which you also lose when passing through the clipboard). I guess that's a less common scenario because UTF-16 files are always supposed to have a BOM. (Some editors will let you do it though.)
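
Here is a small Python sketch of that problem, assuming a tool that compares decoded text (an assumption on my part, not a description of any real patch tool; cp1252 stands in for "ANSI"). The bytes on disk differ completely, yet the decoded text is identical, so a text-level patch has nothing to record.

```python
text = "hello\n"

ansi_bytes = text.encode("cp1252")      # one byte per character
utf16_bytes = text.encode("utf-16-le")  # two bytes per character, no BOM

# The files differ on disk...
bytes_differ = ansi_bytes != utf16_bytes

# ...but once each is decoded with its own encoding the text is the same,
# so a diff of the decoded lines would be empty.
same_text = ansi_bytes.decode("cp1252") == utf16_bytes.decode("utf-16-le")

# With a BOM there is at least a U+FEFF on the first line to diff against.
utf16_with_bom = "\ufeff".encode("utf-16-le") + utf16_bytes
first_line = utf16_with_bom.decode("utf-16-le").splitlines()[0]
```

Without the BOM, the only remaining hint would be the encoding of the patch file itself, and that is exactly what gets lost on the clipboard.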


Received on 2015-07-17 09:47:43 CEST

This is an archived mail posted to the TortoiseSVN Users mailing list.