
Re: Problems Encoding

From: Eric Lemes <ericlemes_at_gmail.com>
Date: 2006-06-09 15:02:30 CEST

On 6/9/06, Samuel Langlois <slanglois@ilog.fr> wrote:
>
> Hello,
>
> I have the same thing in French localization:
> C:\>svn --version
> svn, version 1.3.2 (r19776)
> compil?\195?\169 May 26 2006, 13:10:00
>
> It seems svn spits UTF-8, which the basic Windows Command Prompt cannot
> handle.
> Setting the LANG environment variable to en_US switches to English
> messages, which is an acceptable workaround for me.
>

Well, I don't think it spits UTF-8. I'm capturing stdout from a C# app:
when I read the output of svn --xml with UTF-8 encoding, all chars come
through okay, but when I capture svnlook's stdout I get scrambled chars.

I think the svn command-line tools take their localization from the Windows
Regional Settings and emit chars in the "codepage" configured on the Regional
Settings "Advanced" tab (under "Select a language for non-Unicode
programs"). My problem is that when I set this to "Brazilian Portuguese", the
svn tools print good chars in the Windows console, but I can't read the
captured stdout in my C# app as either UTF8 or Default (ANSI encoding,
iso-8859-1 I think).
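
To illustrate the decoding problem described above, here is a minimal Python sketch of a fallback decoder for captured tool output. The helper name decode_tool_output and the fallback code page cp850 are assumptions for illustration only (cp850 is a guess at the console code page on a Brazilian Portuguese system, not something svn documents). The idea is simply to try UTF-8 first and fall back to a legacy code page when the bytes are not valid UTF-8.

```python
def decode_tool_output(raw: bytes, fallback: str = "cp850") -> str:
    """Decode captured stdout bytes (hypothetical helper).

    Tries strict UTF-8 first; if the bytes are not valid UTF-8,
    decodes them with the assumed legacy code page instead.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: the tool wrote its output in a single-byte code page.
        return raw.decode(fallback)


if __name__ == "__main__":
    # The same two characters, once as UTF-8 bytes and once as cp850 bytes.
    print(decode_tool_output(b"\xc3\xa7\xc3\xa3"))
    print(decode_tool_output(b"\x87\xc6"))
```

This only helps when the legacy bytes happen to be invalid UTF-8, which is usually the case for accented Latin text. Plain ASCII output decodes identically either way.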

Setting LANG=en_US didn't affect anything.

I did a test with a plain text file, saved as ANSI, containing the accented
chars "çã". In a hex editor, these two chars show up as E7 and E3.

With svnlook > textfile.txt (as ANSI too), I got 87 and C6 for the same
chars.

Looking at the UTF-8 output (captured from svn --xml), I got two bytes
for every accented char: "ç" = C3 A7, "ã" = C3 A3.
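
The byte values above can be cross-checked with a short Python sketch that encodes the same two characters under a few candidate encodings. The choice of candidates (cp1252 for "ANSI", cp850 for the console code page) is an assumption made to match the observed bytes, not something confirmed by the svn tools themselves.

```python
# The two test characters, written as escapes so the script is
# independent of the encoding of its own source file.
TEXT = "\u00e7\u00e3"  # c-cedilla, a-tilde

# Candidate encodings (assumed, chosen to compare against the reported bytes).
CANDIDATES = {
    "cp1252": "Windows ANSI code page (Western European)",
    "cp850": "OEM console code page (Western European)",
    "utf-8": "UTF-8",
}


def hex_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes of text as space-separated upper-case hex."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))


if __name__ == "__main__":
    for name, label in CANDIDATES.items():
        print(f"{name:7} {hex_bytes(TEXT, name):12} {label}")
```

Under these assumptions, cp1252 gives E7 E3, cp850 gives 87 C6, and UTF-8 gives C3 A7 C3 A3, which matches the ANSI file, the svnlook redirect, and the svn --xml output respectively.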

Thanks anyway,

[]'s

Eric Lemes
Received on Fri Jun 9 15:04:46 2006

This is an archived mail posted to the Subversion Users mailing list.
