Re: Support for multibyte character encodings, particularly in the diff/merge code

From: Peter N. Lundblad <peter_at_famlundblad.se>
Date: 2005-06-26 20:48:26 CEST

On Fri, 24 Jun 2005, Alastair Houghton wrote:

> I was wondering what the status was with regard to support for
> multibyte encodings (and in particular those that don't look like
> ASCII, such as UTF-16 or UCS-2)?
I've been thinking out loud in
http://svn.collab.net/repos/svn/trunk/notes/diff-encoding.txt. That note
only covers diff and merge support, and I'm not planning to work on it in
the near future.

> The reason I'm asking is that I want to store some OS X .strings
> files in Subversion; currently they are marked as binary and
> Subversion refuses to diff them (and will probably not merge them by
> itself). How difficult would it be to add the necessary code to
> support such cases in the diff library?
I don't think the diff library would be very hard. You'd need to teach the
library what newlines look like in different encodings. But you'd need to do
substantial work in the WC library as well, since it has to support
newline/keyword translation too.
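
To make the newline point concrete, here is a minimal Python sketch (not Subversion code; the helper names `newline_bytes` and `split_lines` are made up for illustration). It assumes a fixed-width code unit for the newline and a BOM-less codec name such as "utf-16-le", and it only compares at code-unit boundaries, so a character like U+0A0A (which encodes to the bytes 0A 0A in UTF-16LE) is not mistaken for a line break.

```python
def newline_bytes(encoding):
    """Return the bytes that encode '\n' in a BOM-less encoding."""
    return "\n".encode(encoding)


def split_lines(data, encoding):
    """Split raw bytes into lines, keeping the encoded newline on each line.

    Comparisons happen only at multiples of the newline width, which is
    the code-unit size for UTF-8 (1 byte) and UTF-16LE/BE (2 bytes).
    """
    nl = newline_bytes(encoding)
    width = len(nl)
    lines = []
    start = 0
    pos = 0
    while pos + width <= len(data):
        if data[pos:pos + width] == nl:
            lines.append(data[start:pos + width])
            start = pos + width
        pos += width
    if start < len(data):
        # Trailing text without a final newline.
        lines.append(data[start:])
    return lines


if __name__ == "__main__":
    for enc in ("utf-8", "utf-16-le", "utf-16-be"):
        print(enc, newline_bytes(enc))
    print(split_lines("a\nb\n".encode("utf-16-le"), "utf-16-le"))
```

A byte-oriented splitter that looks for 0x0A alone would cut in the middle of a UTF-16 code unit, which is the case the diff library would have to learn to avoid.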

> I wouldn't mind if it didn't support cross-encoding diff/merge, as
> that's complicated and users can easily do that kind of diff by using
> iconv to convert both files to the same encoding first, but surely it
> wouldn't be too hard to get it to support the same-encoding case?
That might be a reasonable limitation, but you can be sure we'd get "bug
reports" about it...
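
For what it's worth, the convert-first workaround can be sketched outside Subversion in a few lines of Python. This is only an illustration of the idea, using the standard `difflib` module rather than iconv or Subversion's diff library; the function name `diff_encoded` and the "old"/"new" labels are invented here.

```python
import difflib


def diff_encoded(old_bytes, new_bytes, old_encoding, new_encoding):
    """Decode both inputs to text, then produce a unified diff.

    Decoding first puts both sides in one representation, so the
    comparison is on characters rather than on encoded bytes.
    """
    old_lines = old_bytes.decode(old_encoding).splitlines(keepends=True)
    new_lines = new_bytes.decode(new_encoding).splitlines(keepends=True)
    return list(difflib.unified_diff(old_lines, new_lines, "old", "new"))


if __name__ == "__main__":
    # The "utf-16" codec writes a BOM on encode and consumes it on decode.
    old = '"greeting" = "Hello";\n'.encode("utf-16")
    new = '"greeting" = "Hi";\n'.encode("utf-8")
    print("".join(diff_encoded(old, new, "utf-16", "utf-8")))
```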

> On an ancillary note, will Subversion treat files as binary if the
> svn:mime-type is set to something like
> text/plain; charset=UTF-16
It will treat the files as text, but the merging will be incorrect and
produce invalid UTF-16.
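
One plausible way this goes wrong, sketched in Python under the assumption that the merge works on byte-oriented lines (this is an illustration, not a trace of Subversion's actual merge code, and both helper names are hypothetical): splitting UTF-16LE on the single byte 0x0A leaves the high byte of each newline at the start of the next "line", so dropping or reordering such lines shifts every later code unit by one byte.

```python
def drop_first_byte_line(data):
    """Simulate a byte-oriented merge that removes the first 'line'.

    bytes.splitlines() breaks after the 0x0A byte, which in UTF-16LE is
    only the first half of the two-byte newline.
    """
    lines = data.splitlines(keepends=True)
    return b"".join(lines[1:])


def is_valid_utf16le(data):
    """Return True if data decodes cleanly as UTF-16LE."""
    try:
        data.decode("utf-16-le")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    original = "a\nb\n".encode("utf-16-le")
    merged = drop_first_byte_line(original)
    print(is_valid_utf16le(original), is_valid_utf16le(merged), len(merged))
```

In this example the result has an odd number of bytes, so it cannot be valid UTF-16 at all; in other cases the length may stay even while the text silently turns into different characters.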


Received on Sun Jun 26 20:50:11 2005

This is an archived mail posted to the Subversion Users mailing list.
