Re: Can't recode string (locale problem ?)

From: Dimitri Papadopoulos-Orfanos <papadopo_at_shfj.cea.fr>
Date: 2004-06-21 13:38:28 CEST

Hi,

>>This is somewhat misleading. Yes, the repository keeps filenames in UTF-8,
>>so does the client internally. This has nothing to do with the encoding of
>>the local filesystem.
>
>
> OK, I think this time I understand it right ...

Again, I don't think my comment was misleading. Something seems to be
wrong with filename recoding, and precisely at the step where Subversion
recodes from its internal encoding (UTF-8) to the local filesystem
encoding.

>>Sure, because you are telling Subversion that your filenames are UTF-8
>>encoded when in fact they are not. The "Can't recode string" error message
>>usually means that you did not set the LANG or LC_CTYPE env vars at all; in
>>those cases Subversion doesn't know what the source encoding is, so it
>>can't recode.

Shouldn't it fall back to a C locale in that case?

>>Assuming that LANG and LC_CTYPE were not set at all, you need to set them
>>to fr_CH, unless you recode all filenames on your system to be UTF-8
>>compliant...

Actually LANG and LC_CTYPE were set correctly, see first posts in this
thread.

> I had LANG and LC_CTYPE set to fr_CH from the beginning ! On the local
> filesystem everything is OK, the filenames are displayed correctly in the
> shell and in any application. Still, subversion does not seem to understand

If both LANG and LC_CTYPE are set to fr_CH, Subversion should be able to
recode correctly. However, the filesystem encoding under that locale on
Linux is probably ISO 8859-1, not UTF-8. As already pointed out, this
shouldn't be an issue: Subversion should be able to recode from ISO
8859-1 to UTF-8 and back.
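
To illustrate what "recode from ISO 8859-1 to UTF-8 and back" means, here
is a minimal Python sketch. It uses Python's own codecs as a stand-in and
is not Subversion's recoding code; the filename and the can_recode()
helper are hypothetical, chosen only for the example.

```python
# Sketch only: Python codecs stand in for Subversion's recoding layer.
name = "résumé.txt"  # hypothetical filename with accented characters

# UTF-8 form, as Subversion keeps it internally and in the repository.
utf8_bytes = name.encode("utf-8")

# ISO 8859-1 form, as it would appear on a filesystem used under fr_CH.
latin1_bytes = utf8_bytes.decode("utf-8").encode("iso-8859-1")

# Recoding back must reproduce the original UTF-8 bytes.
round_trip = latin1_bytes.decode("iso-8859-1").encode("utf-8")


def can_recode(utf8_name: bytes, target: str) -> bool:
    """Return True if a UTF-8 name can be represented in the target charset."""
    try:
        utf8_name.decode("utf-8").encode(target)
        return True
    except (UnicodeDecodeError, UnicodeEncodeError):
        # Either the input is not valid UTF-8, or the target charset
        # has no representation for one of the characters.
        return False
```

With these definitions, can_recode(utf8_bytes, "iso-8859-1") succeeds,
while can_recode(utf8_bytes, "ascii") fails because plain ASCII has no
accented characters. A failure of this kind is the sort of condition one
would expect behind a recoding error.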

> them right ... What can I do ? Is there another LC_* variable that
> subversion cares about ? Is it possible that the locale definition is

There's no other variable to set.

Actually I would suggest *unsetting* LC_CTYPE and setting only LANG,
which seems to be the default on most distributions. Maybe this will
help work around the problem.
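
For reference, the usual POSIX lookup order for the character-encoding
locale is LC_ALL, then LC_CTYPE, then LANG. The sketch below mimics that
order in Python. The effective_ctype() function is hypothetical, and the
final fallback to "C" is an assumption (POSIX leaves the default
implementation-defined); it does not show what Subversion itself does.

```python
def effective_ctype(env):
    """Return the locale name that governs character encoding.

    Follows the POSIX precedence: LC_ALL overrides LC_CTYPE, which
    overrides LANG. Unset or empty variables are skipped.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value
    # Assumed default when nothing is set.
    return "C"
```

So with only LANG=fr_CH set, the result is fr_CH, and a stray LC_CTYPE or
LC_ALL would silently take precedence over it. Calling
effective_ctype(dict(os.environ)) after importing os would show which
value applies in the current shell.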

> incorrect on my distribution (Mandrake 10.0 Official) ? But it doesn't seem to
> be the case as the filenames are correct for all other apps. I'm quite lost
> there ...
>
> Thanks for your explanations, at least I understand how subversion uses
> UTF-8 (even if it seems I do not understand completely ... ;-)

Well, it seems Subversion may have messed up something.

I would suggest:

1) Set LANG to fr_CH and unset LC_CTYPE.

2) Send as attachments the output of the command "ls -R" with LANG set to
fr_CH and to fr_CH.UTF-8. This way we can check whether there are any
(hidden?) invalid characters in the filenames.

If the filenames are correct, it could be that the filenames recorded in
the repository are incorrectly encoded for some reason. Or that there is
a bug in filename recoding somewhere.
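
The check for invalid characters could also be approximated with a small
script that looks at the raw bytes of each filename. This is only a
sketch: classify_name() and scan() are hypothetical helpers, and the
script can tell ASCII, valid UTF-8, and "something else" apart but cannot
prove that a name really is ISO 8859-1.

```python
import os


def classify_name(raw: bytes) -> str:
    """Classify the raw bytes of a filename by apparent encoding."""
    if all(b < 0x80 for b in raw):
        return "ascii"
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Typically a single-byte encoding such as ISO 8859-1.
        return "not utf-8"


def scan(root: bytes):
    """Yield (path, classification) for every non-ASCII name under root."""
    # Passing bytes to os.walk makes it return undecoded bytes names.
    for dirpath, dirnames, filenames in os.walk(root):
        for entry in dirnames + filenames:
            kind = classify_name(entry)
            if kind != "ascii":
                yield os.path.join(dirpath, entry), kind
```

Running list(scan(b".")) in the working copy would list the names worth
looking at. Under an fr_CH (ISO 8859-1) locale, names reported as "utf-8"
would be the suspicious ones.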

Dimitri

Received on Mon Jun 21 13:40:09 2004

This is an archived mail posted to the Subversion Users mailing list.