
Re: invalid character data during recursive add

From: Matt Pounsett <matt.pounsett_at_cira.ca>
Date: 2004-09-01 22:49:22 CEST

Looks like I've got this sorted out, actually. It turns out LANG=C, while
good enough to fix the charset problems in perl, is not good enough for this
case. LANG=en_US was required.
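
A minimal sketch of why the locale's charset could matter here, using the bytes from the error message quoted below. It assumes (this is not confirmed in the thread) that LANG=en_US selects ISO-8859-1 on this system and that LANG=C behaves like plain ASCII; the helper name try_decode is made up for the illustration.

```python
# Bytes copied from the "Valid UTF-8 data ... followed by invalid UTF-8
# sequence" error quoted later in this message.
valid_part = bytes.fromhex(
    "73 61 74 69 6f 6e 5f 64 65 73 5f 6d 61 72 71 75 65 73 5f 64 5f 61 67 72"
)
invalid_part = bytes.fromhex("e9 6d 65 6e")


def try_decode(data, charset):
    """Return the decoded text, or None if the bytes are not valid in charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


# Assumption: "ascii" stands in for LANG=C, "iso-8859-1" for LANG=en_US.
results = {
    charset: try_decode(valid_part + invalid_part, charset)
    for charset in ("ascii", "utf-8", "iso-8859-1")
}

for charset, text in results.items():
    print(charset, "->", text if text is not None else "cannot decode")
```

Only the ISO-8859-1 decoding succeeds, because the byte e9 is neither ASCII nor the start of a valid UTF-8 sequence in this position. That would be consistent with LANG=en_US working where both C and UTF-8 failed.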

Now if only I could get the 1.4G of data to commit without the HTTP session
locking up. :) I suspect I'll wind up doing the initial commit either in
small segments, or over file:// in order to clear that up.

On Wed, 1 Sep 2004, Matt Pounsett wrote:

>
> I'm working on getting a web site under svn control, but I'm running into an
> error during import which I'm not sure how to handle. It seems to me that I'm
> not being given enough information to correct the problem, and I'm wondering
> if someone can help me get svn to be a bit more specific.
>
> For reference, I'm using svn under RedHat Edge Server 3.0, connecting to a
> remote repository over https, also running under Edge Server 3.0. I've
> changed my local LANG environment variable to 'C' to avoid conflicts since
> the files are primarily ISO-8859-1, while the default charset of the OS is
> UTF-8. (If I leave my charset as UTF-8, then I get errors on the same files as
> below, just slightly different errors).
>
> I'm at the stage where I'm doing an 'svn add *' in the trunk/ directory of my
> working copy -- setting up the initial import. The add is aborting early,
> with the following text (I'm including a few non-error lines for context):
>
> A en/forms/form2.html
> A en/forms/form3.html
> A en/pwd
> A en/home-info
> A en/q4.2003.docs
> A (bin) en/q4.2003.docs/CB Certification - Effective Date Dec 4 2003 - english - Final.doc
> A en/q4.2003.docs/CB Certification - Effective Date Dec 4 2003 - english - Final.txt
> svn: Error during recursive add of 'en/q4.2003.docs'
> svn: Can't recode string
>
> If I run this without switching away from the default UTF-8 character set, I
> get this error instead (details vary depending on the file):
>
> svn: Error during recursive add of 'en/q4.2003.docs'
> svn: Valid UTF-8 data
> (hex: 73 61 74 69 6f 6e 5f 64 65 73 5f 6d 61 72 71 75 65 73 5f 64 5f 61 67 72)
> followed by invalid UTF-8 sequence
> (hex: e9 6d 65 6e)
>
> If I do another 'svn add *' it will move past this file, and error on a
> different file. In total, I've got four or five files that generate the same
> error during an add. Some of these files have a rather sordid past, so it's
> no surprise to me that they might contain mismatched character sets and the
> like, but it would be helpful to me if svn would report more specifically what
> string is causing the error, or at least what line number it appears on.
>
> Running a subsequent 'svn add *' command appears to move past this file, but
> will die later on a different file (same error). It takes several runs of the
> add command to get through the whole directory tree. Because the subsequent
> add commands move past files that generated an error on the previous pass,
> it's unclear to me whether the files are actually being added or not.
>
> There's something else here that's confusing as well... once everything has
> been added, running 'svn status' dies with this error on one of the files that
> killed the add on pass two or three, instead of the file that killed the add
> on pass one. That might just be a matter of the 'status' command sorting
> differently than the shell does, though.
>
> So for my question...
>
> Can anyone shed any light on how to get svn to be more specific about the
> errors it is encountering so that I can fix the files in question... or
> alternatively suggest another way to find the errors in the files and fix them?
> They're rather large, so eye-balling the contents isn't really an option.
>
>
>

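On the quoted question of how to locate the offending items: the hex dump in the error reads like a fragment of a file name rather than file content, so one possible approach is to scan the tree for names that are not valid UTF-8. The following is a rough sketch under that assumption; the function names is_valid_utf8, non_utf8_names and report are hypothetical, and none of this is part of svn.

```python
import os


def is_valid_utf8(name):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def non_utf8_names(root):
    """Yield paths (as bytes) under root whose final component is not UTF-8."""
    # Walking with a bytes path makes os.walk return raw bytes names,
    # so nothing is decoded according to the current locale.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                yield os.path.join(dirpath, name)


def report(root):
    """Print each suspect path, showing what it would be as ISO-8859-1."""
    for path in non_utf8_names(root):
        # Assumption: the stray names are ISO-8859-1, as the thread suggests.
        print(path, "->", path.decode("iso-8859-1"))
```

Calling report("trunk") from the top of the working copy would list the candidates in one pass, instead of finding them one failed 'svn add' at a time. Pure ASCII names pass the check, so ordinary files are not reported.
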
-- 
Matt Pounsett                 CIRA - Canadian Internet Registration Authority
Technical Support Programmer                    350 Sparks Street, Suite 1110
matt.pounsett@cira.ca                                 Ottawa, Ontario, Canada
613.237.5335 ext. 231                                      http://www.cira.ca
Received on Wed Sep 1 22:49:48 2004

This is an archived mail posted to the Subversion Users mailing list.