Re: issue 1463: locale problems during import and export/checkout

From: Michael Wood <mwood_at_its.uct.ac.za>
Date: 2003-08-06 17:52:29 CEST

On Wed, Aug 06, 2003 at 10:07:48AM +0200, Stephan Hermann wrote:
> Hi SVN Users :)
>
> Karl sent me a comment on this issue saying that I should describe the
> problem here on this list.
>
> OK, let's go:
>
> Take a file tree with binary and/or text data, like this:
>
> /content/bla/fasel/großbritannien.html
> /content/fasel/bla/Für alle Zeiten_trailer.mov
>
> As you can see, I'm using German umlauts (ß == &szlig; and ü == &uuml;).
>
> My OS is Debian Linux Woody with all security and bugfix packages applied.
> I'm using Berkeley DB 4.1.25 or 4.0.14 as reference implementation.
> Also I'm using Apache 2.0.45/2.0.46 as a WebDAV server.
>
> Neon, OpenSSL, etc. are always the latest stable releases.
>
> OK, my system locale is POSIX (export LANG=POSIX). This locale is set by
> the installation server of our server farm, so normally it does no harm.
>
> OK, now do the following:
>
> svnadmin create /data/repos/inbox
>
> svn import /content file:///data/repos/inbox
>
> When you now try to import the files into the repository, the import
> aborts with the error: failure during string recode (utf.c:173)
> (libsvn_subr).
> After this, the DB is completely broken.

I suspect running "svnadmin recover /data/repos/inbox" will fix it.

> When I change the locale with "export LANG=de_DE@euro" and do the import
> again, everything works fine.

This is because umlauts etc. are not valid in the POSIX locale, whose
character set is effectively plain 7-bit ASCII, so the filename cannot be
recoded.
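
To illustrate, here is a rough Python sketch of the import direction. It is
not Subversion's actual recoding code; it assumes the POSIX locale behaves
like 7-bit ASCII and that de_DE@euro means ISO-8859-15, and the helper name
to_utf8 is made up for the example.

```python
# Sketch only: models "recode a filename from the locale charset to UTF-8".
# Assumptions: POSIX locale == ASCII, de_DE@euro == ISO-8859-15.

def to_utf8(raw_name: bytes, locale_charset: str) -> bytes:
    """Decode filename bytes with the locale's charset, re-encode as UTF-8."""
    return raw_name.decode(locale_charset).encode("utf-8")

# The filename as bytes on a Latin-9 filesystem (the "ß" is one byte, 0xDF).
raw = "großbritannien.html".encode("iso-8859-15")

# With a matching locale the recode succeeds.
utf8_name = to_utf8(raw, "iso-8859-15")

# Under an ASCII-only locale the byte 0xDF has no meaning, so it fails.
try:
    to_utf8(raw, "ascii")
    posix_ok = True
except UnicodeDecodeError:
    posix_ok = False
```

The point is only that the same bytes are recodable or not depending on
which charset the client's locale claims.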

> Now the other way around: you import the file tree with a correct locale
> setting, then reset the locale to "POSIX" or to another locale
> != iso-8859-xx, and try to export or check out (I did a checkout in this
> test).
>
> When you reach the first file with a special character (German umlaut), the
> checkout aborts with the same error message I quoted above, at the same
> line in the source code.
>
> What's the problem with it:
>
> I have to serve different data repositories for different countries.
> Each country has its own locale setting, but I can't change the locale
> every time, because all repositories are hosted on an HA cluster.
>
> So, if I have a German user with a German locale, and he wants to
> check out a repository which was imported with e.g. a Russian locale
> (Cyrillic charset), this action will abort, and during adding and
> importing files it will destroy the Berkeley DB behind the repository.
[snip]

But the locale is a client side thing, not a server side thing. If this
causes the repository to need a recovery, I think that is a bug.

Say User A has their locale set to de_DE@euro and User B has theirs set to
some Cyrillic locale. Filenames are translated to UTF-8 internally, so the
repository never knows or cares about the clients' locales. If the German
user tries to check out filenames with Russian special characters in them,
then he's going to have trouble, because those characters don't exist in his
locale's character set. But maybe setting the locale to something like
de_DE.UTF-8 would work?
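
A rough Python sketch of the checkout direction, under the same caveats: it
is not Subversion's code, the helper names and the Cyrillic filename are
invented for the example, and ISO-8859-15 and KOI8-R are just stand-ins for
"a German locale" and "a Russian locale".

```python
# Sketch only: the repository holds UTF-8 names; each client recodes them
# into whatever charset its own locale uses.

def to_native(utf8_name: bytes, locale_charset: str) -> bytes:
    """Recode a UTF-8 filename into the client's locale charset."""
    return utf8_name.decode("utf-8").encode(locale_charset)

def can_check_out(utf8_name: bytes, locale_charset: str) -> bool:
    """True if every character of the name exists in the client's charset."""
    try:
        to_native(utf8_name, locale_charset)
        return True
    except UnicodeEncodeError:
        return False

german = "Für alle Zeiten_trailer.mov".encode("utf-8")
russian = "Привет.txt".encode("utf-8")  # hypothetical Cyrillic filename

results = {
    "german name, German locale": can_check_out(german, "iso-8859-15"),
    "russian name, German locale": can_check_out(russian, "iso-8859-15"),
    "russian name, Russian locale": can_check_out(russian, "koi8-r"),
    "russian name, UTF-8 locale": can_check_out(russian, "utf-8"),
    "german name, UTF-8 locale": can_check_out(german, "utf-8"),
}
```

Only the "russian name, German locale" case fails here, which is why a UTF-8
locale on the client looks like the way out.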

I am by no means an expert on this sort of thing, though...

-- 
Michael Wood <mwood@its.uct.ac.za>
Received on Wed Aug 6 17:53:20 2003

This is an archived mail posted to the Subversion Users mailing list.