
RE: [Subclipse-users] UTF-16 text files not always handled properly in compare editor

From: David Balažic <david.balazic_at_hermes-softlab.com>
Date: Sun, 22 Feb 2009 15:23:22 +0100

Mark Phippard wrote:

> On Fri, Feb 20, 2009 at 9:07 AM, David Balažic
> <david.balazic_at_hermes-softlab.com> wrote:
>
> > I have a UTF-16 text file in the project.
> >
> > Opening and editing it in Eclipse works fine.
> > In the Synchronize view, I can double-click the file under Outgoing
> > and the compare editor opens, again correctly (showing that I edited
> > one line and added a new line).
> >
> > But after committing and doing a diff in history, it is wrong. Details:
> > - in the Project Explorer view, right-click the file and select
> >   Team/Show History
> > - in the History view, select two versions of the file, right-click
> >   and select Compare...
> > - in the Compare dialog, leave the options as they are and click OK
> >
> > Result: the compare editor shows the contents, but fails to detect
> > that it is UTF-16, so it shows a square character between each real
> > character. Basically, it treats the file as 8-bit instead of 16-bit.
> >
> > As it works fine in the Synchronize view, I guess it is just a small
> > bug in file format detection and hope it is easily fixed.
>
> I believe there is no way to know the encoding to use. When there is
> a local file involved from your workspace, it uses the encoding set on
> that file in Eclipse. When both files are from the repository the
> encoding is not known. There might be an open issue for this already.
> A year or two ago, one of our committers from Asia went through all
> the code and put this support in place (for obvious reasons). I think
> this was the only scenario where it could not be done.
>
> I believe Eclipse 3.5 has enhanced the compare editor so that you can
> set the encoding on the fly. You might check that out as an option.
>
> FWIW, if it is at all possible, I'd recommend you do not use UTF-16 in
> your files. Subversion treats these files as binary (unmergeable). If
> you can use UTF-8 encoding you will get better support from
> Subversion. See:
>
> http://subversion.tigris.org/issues/show_bug.cgi?id=2194

The file was created outside of Eclipse; Eclipse figured out that
it is UTF-16 on its own. The file properties dialog in Eclipse says:
Text file encoding:
(*) Default (determined from content:UTF-16)
( ) Other:...
Byte Order Mark is UTF-16 Little-Endian (BOM)

I don't understand why the compare editor would treat the file differently
depending on whether it comes from the local filesystem or from SVN. It seems
the same logic is implemented twice, slightly differently.
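
To illustrate what I mean, here is a rough sketch of content-based detection
applied to raw bytes, regardless of where they come from. This is not the
actual Eclipse or Subclipse code; the class name BomSniffer and its methods
are hypothetical, and it only checks for a UTF-8 or UTF-16 byte order mark
(UTF-32 is not handled). It also shows why decoding the same bytes as an
8-bit charset produces an extra character after every real one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of Eclipse or Subclipse.
final class BomSniffer {

    /** Returns the charset indicated by a leading BOM, or null if no known BOM is present. */
    static Charset detect(byte[] data) {
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (data.length >= 2) {
            int b0 = data[0] & 0xFF;
            int b1 = data[1] & 0xFF;
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE; // BOM FF FE
            }
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE; // BOM FE FF
            }
        }
        return null;
    }

    /** Decodes using the BOM's charset (skipping the BOM), else the given fallback. */
    static String decode(byte[] data, Charset fallback) {
        Charset cs = detect(data);
        if (cs == null) {
            return new String(data, fallback);
        }
        int skip = (cs == StandardCharsets.UTF_8) ? 3 : 2;
        return new String(data, skip, data.length - skip, cs);
    }

    public static void main(String[] args) {
        // "Hi" as UTF-16 Little-Endian with BOM, as a repository might return it.
        byte[] utf16le = {(byte) 0xFF, (byte) 0xFE, 'H', 0, 'i', 0};

        // BOM-aware decoding recovers the text.
        System.out.println(decode(utf16le, StandardCharsets.ISO_8859_1)); // prints "Hi"

        // Treating the bytes as 8-bit yields 6 characters: the two BOM bytes,
        // plus a NUL after each letter (what the compare editor draws as squares).
        String naive = new String(utf16le, StandardCharsets.ISO_8859_1);
        System.out.println(naive.length()); // prints 6
    }
}
```

If the remote revisions were passed through a check like this before falling
back to a default encoding, both sides of the comparison would be decoded the
same way, at least for files that carry a BOM.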

Regards,
David

------------------------------------------------------
http://subclipse.tigris.org/ds/viewMessage.do?dsForumId=1047&dsMessageId=1209024

Received on 2009-02-22 15:23:46 CET

This is an archived mail posted to the Subclipse Users mailing list.