
Re[4]: svn diff character set handling problem

From: Wang Jian <lark_at_linux.net.cn>
Date: 2005-02-06 15:56:05 CET

Hi kfogel,

On 05 Feb 2005 23:17:11 -0600, kfogel@collab.net wrote:

> Wang Jian <lark@linux.net.cn> writes:
> > Hi kfogel,
> >
> > Here is my test
> >
> > $ cat ./mydiff
> > #!/bin/sh
> >
> > echo "$@"
> > $ svn diff --diff-cmd=./mydiff baserequestgenerator.class.php
> > Index: baserequestgenerator.class.php
> > ===================================================================
> > -u -L baserequestgenerator.class.php ޶ 956 -L baserequestgenerator.class.php .svn/text-base/baserequestgenerator.class.php.svn-base baserequestgenerator.class.php
> >
> >
> > If I deliberately set to another locale
> >
> > $ LC_ALL=zh_TW svn diff --diff-cmd=./mydiff baserequestgenerator.class.php
> > Index: baserequestgenerator.class.php
> > ===================================================================
> > svn: Can't recode string
>
> Hmmm. First, I don't understand why your script works as a diff-cmd.
> All it does is echo its arguments, right? It never actually runs a
> diff program. So how is it producing the output shown above?

The first of the two tests above shows correct Simplified Chinese
characters, so the code path that calls an external diff command looks
correct: the conversion is made.
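
To make this check easier to repeat, the mydiff script could be extended along
the following lines. This is only a sketch: dump_args is a made-up helper name,
not anything from Subversion, and it assumes a POSIX sh with od available. It
prints each argument on its own line together with its raw bytes, so the
encoding of the -L labels can be read off regardless of how the terminal
renders them.

```sh
#!/bin/sh
# Hypothetical diagnostic diff-cmd (not part of Subversion).
# Prints every argument on its own line, followed by its raw bytes in hex,
# so the encoding of the -L labels can be checked independently of the terminal.
dump_args() {
    i=1
    for arg in "$@"; do
        printf 'arg %d: %s\n' "$i" "$arg"
        # od -An -tx1 prints the bytes as hex without an address column.
        printf '%s' "$arg" | od -An -tx1
        i=$((i + 1))
    done
}

dump_args "$@"
```

It would be used the same way as before, e.g.
svn diff --diff-cmd=./mydiff baserequestgenerator.class.php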

The builtin diff code path, which my earlier mail refers to, doesn't
seem to handle encoding conversion. I haven't looked at the code yet.
The Spring Festival is coming here, so I don't have much time.

> In any case, I can see that the first output looks correctly encoded
> ("revision 956" versus "working copy"), and the second output shows a
> failure to encode -- which means encoding was at least attempted. I
> don't know all the reasons encoding might fail (note that r12920 made
> that error much more informative, you might want to try with the very
> latest trunk svn), but anyway it seems like this error is completely
> different from the error you were reporting in your original mail.
> There, the problem was that the diff labels were simply not in the
> right encoding for your console (that is, it was not clear whether
> encoding had been attempted or not).

I did several more tests later and found that the error in the second
test above is bogus. The rpm package didn't install the zh_TW version of
subversion.mo, so it falls back to zh and thus uses the zh_CN version of
subversion.mo instead.

So please forget the second test :)

> So I'm not sure how many different bugs are being reported now.
> Perhaps you could clarify, by stating exactly what output you
> expected, versus the output you received? That would help us sort out
> the exact bug or bugs present here. If it's something beyond what we
> already have in issue #1533, then that would be useful to know.

Here it is:

When the zh_CN locale is used, 'svn diff' (builtin diff) is expected to
output in the locale encoding (gb2312/gbk/gb18030, each one a superset
of the one before it); however, it still outputs in UTF-8.

The zh_CN version of subversion.mo is in UTF-8.
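
The difference between the two outputs can be seen at the byte level with a
small script like the one below. It is an illustration only: the label text is
an assumed example string rather than something taken from subversion.mo,
hex_in is a made-up helper name, and the script assumes the file itself is
saved as UTF-8 and that iconv knows the charset names GB2312, GBK and GB18030.

```sh
#!/bin/sh
# Illustrative only: compare the bytes of one label in UTF-8 and in the
# zh_CN locale encodings. The label text is an assumed example.
label='修订版 956'

hex_in() {
    # $1 = target charset name as understood by iconv(1)
    printf '%s' "$label" | iconv -f UTF-8 -t "$1" | od -An -tx1
}

for cs in UTF-8 GB2312 GBK GB18030; do
    printf '%s:%s\n' "$cs" "$(hex_in "$cs")"
done
```

For characters that exist in gb2312, the three locale encodings should give the
same bytes, while the UTF-8 line differs. A console set to one of the zh_CN
encodings would therefore show the UTF-8 output as garbage.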

Received on Sun Feb 6 15:57:01 2005

This is an archived mail posted to the Subversion Dev mailing list.
