
Re: Non Ascii chars in paths cause trouble

From: Erik Huelsmann <e.huelsmann_at_gmx.net>
Date: 2004-09-21 21:42:03 CEST

> Hi Folks,

Hi Luebbe,

> A month ago, I asked a question about non ascii chars in dumpfiles,
> which caused problems when displaying them. Now I'm up the proverbial
> creek without the proverbial paddle, because I'm trying to migrate old
> repositories to a 1.1 server to be prepared when 1.1 is released.
>
> The problem is: "My German characters are scrambled"

> Source system:
> - Suse Linux 8
> - Subversion 0.32.1
> - BerkeleyDB 4.0.14
> - commits made by several clients (mostly TortoiseSVN) built against svn
> versions ranging from pre 0.32 to 1.1.0.rc3

OK, 0.32.1 is a *very* good reason to migrate. In those days the checks for
UTF-8 conformance were not as strict as they are now (and I believe even the
current ones could be better). You may have been committing non-UTF-8 paths
and log messages into your repository.
 
> Target system:
> - Suse Linux 9
> - Subversion 1.1.0RC3
> - BerkeleyDB 4.2.52
>
> On the source system:
> Attached is a snippet of the created dumpfile copied from vi. As you can
> see, the Umlaut 'ö'='oe' in the log message "Böse Welt, ob das gut geht"
> is scrambled, but the characters in Node-path: tags/Umlautname_Ä_Ö_Ü
> look 'proper'.

What locale does your vi terminal run in? Does it use the ISO-8859-1 / -15
character encoding (assuming you use a German locale)?

If it does, then the fact that the tags directory name looks correct (with
the accented characters intact) is quite alarming. The characters should have
been recoded to UTF-8, which looks 'scrambled' on an ISO-8859-xx encoded
terminal.
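
To illustrate why correctly stored UTF-8 looks 'scrambled' on such a terminal,
here is a minimal Python sketch. The helper name is made up for this example
and is not part of Subversion; it simply encodes text as UTF-8 and then reads
the bytes back as ISO-8859-1, which is roughly what the terminal does.

```python
def as_seen_on_latin1_terminal(text: str) -> str:
    """Hypothetical helper: encode as UTF-8, then misread the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")


def byte_lengths(text: str) -> tuple:
    """Return (UTF-8 length, ISO-8859-1 length) in bytes for the same text."""
    return len(text.encode("utf-8")), len(text.encode("iso-8859-1"))


if __name__ == "__main__":
    log = "B\u00f6se Welt, ob das gut geht?"  # \u00f6 is the umlaut 'o'
    # The single umlaut turns into two odd-looking characters.
    print(as_seen_on_latin1_terminal(log))
    # The UTF-8 form is one byte longer than the ISO-8859-1 form.
    print(byte_lengths(log))
```

So an umlaut that displays as a single, proper character on an ISO-8859-xx
terminal is most likely stored as one raw ISO-8859-xx byte, not as UTF-8.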
 
> 'svnlook log -r40' displays the log message properly
> 'svnlook changed -r40' fails with the following error:
> svn: Invalid argument
> svn: failure during string recoding
> Checking out the tags fails with an 'in
>
> ---SNIP---
> Revision-number: 40
> Prop-content-length: 129
> Content-length: 129
>
> K 7
> svn:log
> V 28
> Böse Welt, ob das gut geht?
> K 10
> svn:author
> V 6
> lonken
> K 8
> svn:date
> V 27
> 2004-09-21T08:47:15.616600Z
> PROPS-END
>
> Node-path: tags/Umlautname_Ä_Ö_Ü
> Node-kind: dir
> Node-action: add
> Node-copyfrom-rev: 37
> Node-copyfrom-path: trunk
> ---SNIP---
>
> On the target system:
> When I scp this dumpfile to the target system and load it, all the
> Umlauts are gone.
>
> 'svnlook log -r40 repository/testrepos/'
> B?\246se Welt, ob das gut geht?
> 'svnlook changed -r40 repository/testrepos/'
> A tags/Umlautname_?\196_?\214_?\220/
>
> I'm afraid that something is going terribly wrong here, that the 0.32.1
> dumpfile isn't utf-8 or something like that. How can I migrate my
> repositories?

You can dump your repositories, find the paths which are not correctly UTF-8
encoded, replace the invalid characters with valid UTF-8 characters, and try
loading the dumpfiles. That's all I can advise you to do. It's not much
advice, and it's a lot of work, but it's possible.
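
A rough sketch of how the search could be automated in Python follows. This
is only an illustration under some assumptions, not a tested migration tool:
the function names are invented for this example, it assumes the bad paths
were committed as ISO-8859-1, and it only looks at the `Node-path:` and
`Node-copyfrom-path:` header lines. Log messages are left alone because
changing them would also mean adjusting the `V`, `Prop-content-length` and
`Content-length` values. File contents that happen to contain a line starting
with one of these headers would be matched as well, so review the hits by
hand.

```python
import sys

# Header lines in a dumpfile that carry repository paths.
PATH_HEADERS = (b"Node-path: ", b"Node-copyfrom-path: ")


def is_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_bad_paths(dump):
    """Yield (line_number, raw_line) for path headers that are not valid UTF-8.

    `dump` is any binary file-like object, read line by line.
    """
    for number, line in enumerate(dump, start=1):
        if line.startswith(PATH_HEADERS) and not is_utf8(line):
            yield number, line.rstrip(b"\n")


def recode_path_line(line: bytes, assumed: str = "iso-8859-1") -> bytes:
    """Re-encode an invalid path header to UTF-8; return other lines unchanged.

    `assumed` is a guess at the encoding the path was committed in.
    """
    if line.startswith(PATH_HEADERS) and not is_utf8(line):
        return line.decode(assumed).encode("utf-8")
    return line


if __name__ == "__main__" and len(sys.argv) > 1:
    # Report only; rewriting the dumpfile is left as a separate, reviewed step.
    with open(sys.argv[1], "rb") as dump:
        for number, raw in find_bad_paths(dump):
            print(number, raw)
```

Run with the dumpfile name as the only argument, this lists the suspicious
header lines. Work on a copy of the dumpfile, and keep in mind that a short
ISO-8859-1 string can occasionally be valid UTF-8 by accident, so a clean
report is no guarantee.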

> Cheers & thanks
> - Lübbe

bye,

Erik.

Received on Tue Sep 21 21:42:25 2004

This is an archived mail posted to the Subversion Dev mailing list.
