
Re: Possible character encoding bug in svnlook

From: Kobayashi Noritada <nori1_at_dolphin.c.u-tokyo.ac.jp>
Date: 2004-09-29 21:15:41 CEST

Hi,

> svnlook log displays a strange behaviour on my SVN server. Depending on
> the locale it displays different results than svn log. I've got a commit
> message that contains a German special character 'ö'. Any other accented
> character will display this bug as well.
> I noticed this, because WebSVN which relies on svnlook, displays garbage
> here.
>
> I ssh'ed onto my server (SuSE 8.2, svn 1.1.0rc3) using PuTTY, which is
> by default set to ISO-8859-1. 'svn log' and 'svnlook log' behave
> differently depending on the locale settings on the Linux box.
>
> Results with terminal set to ISO-8859-1 and locale set to de_DE@euro:
> svnlook log: Doppelte Datei auf dem Server gel?\246scht
> svn log: Doppelte Datei auf dem Server gelöscht
> In this case 'svn log' is correct and 'svnlook log' displays garbage.

Yes. This bug also occurs in many Japanese users' Linux and Windows
environments.
(Many Japanese users have discussed this bug on a Japanese BBS,
 but it seems that nobody has reported it to this list yet... X-(
 I intended to investigate and report it when I had some time, but never
 managed to.
 Sorry for not reporting what I knew, and thank you, Onken.)

When we write a log message in Japanese and run 'svn log', the message is
displayed correctly.
However, when we display it with 'svnlook log', each multi-byte character
turns into two '?\nnn' sequences (where each n is a decimal digit), such as:
  ?\179?\171?\187?\207?\164?\222?\164?\199 3 ?\187?\254?\180?\214?\164?\219
  ?\164?\201?\164?\203?\164?\202?\164?\195?\164?\191?\164?\206?\164?\199
  ?\197?\185?\164?\242?\195?\181?\164?\185?\161?\163
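
To inspect such output, the escapes can be turned back into raw bytes.
The following is only a minimal sketch: it assumes that each '?\nnn' stands
for one byte whose decimal value is nnn, and that the caller knows the
original encoding (ISO-8859-1 for the German example; EUC-JP is my guess for
the Japanese one). The name unescape_svnlook is made up for this
illustration and is not part of Subversion.

```python
import re

# Matches one escape of the form ?\nnn (three decimal digits).
ESCAPE = re.compile(r"\?\\(\d{3})")


def unescape_svnlook(text: str) -> bytes:
    """Rebuild raw bytes from text containing ?\\nnn escapes.

    Assumption: nnn is the decimal value of a single byte (0-255) and
    everything outside the escapes is plain ASCII.
    """
    out = bytearray()
    pos = 0
    for match in ESCAPE.finditer(text):
        out += text[pos:match.start()].encode("ascii")
        out.append(int(match.group(1)))  # raises ValueError if nnn > 255
        pos = match.end()
    out += text[pos:].encode("ascii")
    return bytes(out)


# The German example from the quoted report, decoded as ISO-8859-1.
print(unescape_svnlook(r"gel?\246scht").decode("iso-8859-1"))  # gelöscht

# One two-byte character from the Japanese example, assuming EUC-JP.
print(unescape_svnlook(r"?\164?\199").decode("euc_jp"))  # で
```

If the guessed encoding is wrong, decode() will either fail or return
different characters, so the result should be checked against the original
commit message.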

I hear this bug does not occur on Fedora Core, which uses UTF-8 as its
default character encoding.
I also hear that this bug has existed since at least Subversion 0.37.

This bug is different from #1997 because the displayed characters are
'?\nnn' escapes, not raw UTF-8 bytes.
But I suspect it is likewise caused by character recoding.

That is all I know.

Regards,

-- 
|:  Noritada KOBAYASHI
|:  Dept. of General Systems Studies,
|:  Graduate School of Arts and Sciences, Univ. of Tokyo
|:  E-mail: nori1@dolphin.c.u-tokyo.ac.jp (preferable)
|:          nori@esa.c.u-tokyo.ac.jp
Received on Wed Sep 29 21:16:05 2004

This is an archived mail posted to the Subversion Dev mailing list.
