I expect that your problem is coming from sed's effect on binary input. As
others have mentioned, it's likely that it is not reacting well to what it
perceives as excessively long lines, or line-ending characters that are not
in the correct configuration for your platform. (This will be particularly
grim for programs (like sed) that are written in C when run on platforms
that use CR-LF for end-of-line, as the C I/O library is required to
translate EOLs into LF upon input and back into CR-LF on output.)
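To illustrate why that kind of translation would upset the loader, here is a minimal sketch in Python. The byte string is made up to stand in for the PDF data, and md5_hex is just a helper name I chose. It only shows that turning a single CR-LF into LF changes the MD5 of the content; it does not prove that this is what happened to your dump.

```python
import hashlib

# Made-up bytes standing in for binary file content that happens to
# contain a CR-LF pair as well as a bare LF.
original = b"%PDF-1.4\r\nbinary\x00payload\nend"

# What a text-mode read on a CR-LF platform would hand to the program:
# every CR-LF pair collapsed into a single LF.
translated = original.replace(b"\r\n", b"\n")


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string."""
    return hashlib.md5(data).hexdigest()


# One byte has been lost, so the two digests differ.
print(len(original), len(translated))
print(md5_hex(original))
print(md5_hex(translated))
```

Any tool that adds, drops, or rewrites even one byte inside the file body would lead to the same kind of mismatch between the recorded and the computed checksum.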
One possibility would be to use GNU Emacs, which is quite robust against
these problems (provided you use find-file-literally to make sure that it
doesn't try to be intelligent about the character encoding of your file).
I've successfully used it to patch the character strings in executable
files, so it should work for you.
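In the same binary-safe spirit, a small script could do the renaming while never touching the file bodies. The following is only a sketch, assuming the usual dump layout: blocks of "Key: value" header lines ended by a blank line, followed by exactly Content-length bytes of body when that header is present. The function name reroot_dump and the sample record are my own inventions for illustration, and I have not run this against a real repository dump.

```python
import io


def reroot_dump(src, dst, old_prefix: bytes) -> None:
    """Copy a dump stream from src to dst (both opened in binary mode),
    stripping old_prefix from the path headers and copying bodies verbatim."""
    path_keys = (b"Node-path: ", b"Node-copyfrom-path: ")
    while True:
        line = src.readline()
        if not line:
            break  # end of stream
        if line == b"\n":
            dst.write(line)  # blank separator between records
            continue
        # Start of a header block: process lines up to the blank line.
        content_length = 0
        while line and line != b"\n":
            for key in path_keys:
                if line.startswith(key + old_prefix):
                    line = key + line[len(key) + len(old_prefix):]
                    break
            if line.startswith(b"Content-length: "):
                content_length = int(line.split(b": ", 1)[1].strip())
            dst.write(line)
            line = src.readline()
        dst.write(line)  # the blank line that ends the header block
        # Body (properties plus file text): copied byte for byte, unparsed.
        dst.write(src.read(content_length))


# Hypothetical one-record sample. The body deliberately starts with text
# that looks like a path header and contains CR-LF and a NUL byte.
body = b"Node-path: documentation/x\r\n\x00raw\n"
header = b"Node-path: documentation/trunk/a.bin\nContent-length: %d\n\n" % len(body)
sample = header + body + b"\n\n"

out = io.BytesIO()
reroot_dump(io.BytesIO(sample), out, b"documentation/")
result = out.getvalue()
print(result)
```

For real files you would pass open("dump1", "rb") and open("dump2", "wb") instead of the in-memory streams. The point of the design is that the body is read by length rather than by line, so a line inside a binary file that happens to look like a header is left alone, and no end-of-line translation takes place.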
Another thing which might help is to not rewrite the file names. Instead,
dump and svndumpfilter the data to extract the files you want, then load
that into the new repository. Then use svn commands to adjust the file names.
For instance, you want to change "documentation/trunk/dir/file" into
"trunk/dir/file", so you could do "svn mv documentation/trunk trunk". That
sequence of operations would use only software that was already proven on
the tasks in question.
Dale
-----Original Message-----
From: roland@kanaha.am.mot.com [mailto:roland@kanaha.am.mot.com] On Behalf Of Roland Besserer
Sent: Wednesday, December 29, 2004 6:52 PM
To: users@subversion.tigris.org
Subject: SVN Book Method for Splitting Repos doesn't work
Following the example on page 88, I am trying to split a repo by creating
separate repos for individual projects in the existing repo. The first
steps:
(1) dump the existing repo
(2) svndumpfilter the project you want
work as expected and I can then populate a newly created repo from the
processed dump file of step (2) above.
As the book mentions, one will typically have to modify the node entries
to 're-root' them in the new repository. In my case, I'm using sed to
convert the entries of the original dump:
Node-path: documentation/trunk/dir/file
to
Node-path: trunk/dir/file
and also remove the dump data that would create the "documentation"
directory. The resulting modified dump file appears ok and appears to
load properly (it handles revision 1, for example) until it hits the
first binary file at which point the svnadmin load command aborts with
a checksum error on that binary file.
Looking at the dump file I was surprised to see that it is not
"human readable" as the documentation claims. The binary file (in this
case a PDF) is not uuencoded (or some similar method) but included as
8-bit 'raw' data. That, of course, makes it impossible/difficult to
inspect/edit a dump file using an editor.
Still leaves me at a loss why a simple sed script like:
sed 's|^Node-path: documentation/|Node-path: |' < dump1 > dump2
which removes the leading 'documentation/' part from all node paths
would create this error on running 'svnadmin load newrepo < dump2':
started new transaction, based on original revision 1
* adding path : documentation ... done.
* adding path : branches ... done.
* adding path : tags ... done.
* adding path : trunk ... done.
* adding path : trunk/business ... done.
* adding path : trunk/design ... done.
* adding path : trunk/meetings ... done.
* adding path : trunk/presentations ... done.
* adding path : trunk/reference ... done.
* adding path : trunk/reference/DAVIC ... done.
* adding path : trunk/design/DesignBook/DesignBook.pdf ...svn: Checksum mismatch, rep 'a':
expected: 225d1ed316bf0830dbdd6c50ff1e79e7
actual: f41751bce1f5fe359351df3a9b37be30
Has anyone seen this problem before?
Regards
roland
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org
Received on Thu Dec 30 17:24:51 2004