
Re: character encoding on import

From: Kalin KOZHUHAROV <kalin_at_thinrope.net>
Date: 2006-05-06 04:44:36 CEST

Jamie wrote:
> I'm trying to import a project into Subversion for the first time. The
> project is 2.5 GB and has thousands of files. Some of the file names in
> the project contain non-UTF-8 characters. When I try to import, I get
> errors such as the following:
>
> svn: Valid UTF-8 data
> (hex: 72 6f 6d 61 6e 20 26 20 6a)
> followed by invalid UTF-8 sequence
> (hex: 9a 72 6e 20)
>
> The system is RHEL 4. Is there any way either to convert all filenames
> to valid UTF-8 or to make Subversion import without errors? It's taking
> a very long time to figure out where the invalid filenames are, because
> importing such a large amount of data is slow and every time an error
> crops up I have to start all over again once it has been fixed.

Have you tried convmv [1]?

Do you have your locale set up correctly? What does `locale` give?

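One crude method is:
 find /your/start/dir >/tmp/list
 iconv -f UTF-8 -t UTF-8 /tmp/list
and see where it stops with an error. Giving -f UTF-8 explicitly matters:
without it iconv assumes the encoding of your current locale, and if that
is not UTF-8 the invalid names will pass through unnoticed.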
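
Since iconv gives up at the first bad sequence, you would have to repeat
the check after every fix. Listing all offending names in one pass may be
more convenient. Below is a minimal sketch in Python of that idea; it is
not something Subversion or convmv provides, and the function names
(is_valid_utf8, find_invalid_names) are made up for illustration. It
assumes a Python 3 interpreter is available and only checks names, not
file contents.

```python
import os
import sys


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw bytes decode cleanly as UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_invalid_names(root):
    """Yield the raw byte path of each entry under root whose name is not valid UTF-8."""
    # Walking with a bytes path makes os.walk return names as raw bytes,
    # so nothing gets decoded or escaped before we look at it.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    # Hypothetical usage: python3 check_names.py /your/start/dir
    start = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in find_invalid_names(start):
        # repr() shows the offending bytes as \x.. escapes
        print(repr(path))
```

The bytes from the error message above make a quick sanity check:
is_valid_utf8(b"roman & j") is true, while is_valid_utf8(b"roman & j\x9arn ")
is false. Once you have the list, convmv (or a manual rename) can fix the
names before you retry the import.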

[1] http://j3e.de/linux/convmv/

Kalin.

-- 
|[ ~~~~~~~~~~~~~~~~~~~~~~ ]|
+-> http://ThinRope.net/ <-+
|[ ______________________ ]|
Received on Sat May 6 04:45:52 2006

This is an archived mail posted to the Subversion Users mailing list.
