RE: Splitting out project from repo

From: Bert Huijben <bert_at_qqmail.nl>
Date: Wed, 3 Apr 2013 11:39:26 +0200

Hi,

The ‘svnrdump’ tool that was added in Subversion 1.7 might do exactly what you want to do.

This tool allows creating a dumpfile from a URL (e.g. file:///path/to/repos) and should skip unrelated paths for you during the repository processing.

You probably still want the svndumpfilter processing to drop empty revisions before loading the dump into a new repository.
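
A minimal sketch of that pipeline follows. The function name and its three arguments are placeholders, not values from this thread. It also assumes that the dump written by svnrdump keeps node paths relative to the repository root, so that the project path is a valid include prefix for svndumpfilter; check this against your own dump before relying on it.

```shell
#!/bin/sh
# Sketch only: dump one project through svnrdump, then drop the revisions
# that end up empty. All three arguments are placeholders.
dump_project() {
    repo_url=$1   # e.g. file:///path/to/repos
    project=$2    # path of the project below the repository root
    out=$3        # dump file to write

    # svnrdump is asked only for the history below the given URL;
    # svndumpfilter then removes revisions that contain no nodes and
    # renumbers the remaining ones.
    svnrdump dump "$repo_url/$project" |
        svndumpfilter include --drop-empty-revs --renumber-revs "$project" \
        > "$out"
}
```

It could be called as `dump_project file:///path/to/repos myproject myproject.dump`.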

Bert

From: Bryon Winger [mailto:bryonwinger_at_gmail.com]
Sent: Tuesday, 2 April 2013 23:32
To: subversion_users_at_googlegroups.com
Cc: users_at_subversion.apache.org; tschoening_at_am-soft.de
Subject: Re: Splitting out project from repo

I am going through a similar process myself and have some questions about
your concerns. I'm not trying to rock the boat, just looking for clarity on a few
points.

For perspective, I am working with around 300 individual projects
in a 70+ GB repository containing over 300k revisions.

> If I understand correctly, you manually retrieve each version where
> the given path/project has changed in any way to afterwards dump those
> revisions. Why is this better/faster than using svndumpfilter with
> specifying an include path, but without the need to post process the
> dump files?

I personally don't see the advantage to waiting around for svnadmin dump
to process every unrelated revision. For one project, I am only concerned
with about 200 revisions, spread out over 210k unrelated revisions.

# This example took around 8 hours:

svnadmin dump /path/to/master | svndumpfilter include --drop-empty-revs \
    --renumber-revs "$PROJECT" > "$PROJECT.dump"

# However, when I run this on the same project:

for rev in `svn log -r0:HEAD file:///path/to/master/$PROJECT | egrep \
    "^r[0-9]+ \|" | cut -d " " -f1`; do
    svnadmin dump --incremental -r ${rev:1} /path/to/master | svndumpfilter \
        include $PROJECT >> $PROJECT.dump
done

… I can have a usable dump file in under 30 seconds. I realize this will take
longer for larger projects, but I think it makes my point. ‘svnadmin dump’ is
still creating a full dump stream for each revision before svndumpfilter sees
that revision and decides whether to keep it.
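
The revision-selection step of the loop above can be isolated into a small filter, which makes it easy to check before a long run. This is only a sketch: it assumes the usual `svn log` header layout (`r123 | author | date | N lines`), and `revs_from_log` is a hypothetical helper name. The pipe character has to be matched literally; unescaped in an extended regular expression it means alternation.

```shell
#!/bin/sh
# Read `svn log` output on stdin and print one bare revision number per
# line for every header line of the form "r123 | author | date | ...".
revs_from_log() {
    # In a basic regular expression (the sed default) "|" is a literal
    # character, so no escaping is needed here.
    sed -n 's/^r\([0-9][0-9]*\) |.*/\1/p'
}
```

Feeding it from `svn log -q -r0:HEAD file:///path/to/master/$PROJECT` keeps commit messages out of the input, so a message line that happens to start with `r123 |` cannot be mistaken for a header.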

> Are you sure your approach doesn't need other paths
> from the repo, e.g. other source paths from copy operations for
> projects or stuff like that?

I absolutely agree with checking for this. You can’t successfully pull out
a single path using svnadmin dump / svndumpfilter if there are copies from a
location outside of whatever you are filtering for.

I did notice that using svnrdump pointing to url/project seems to get
around the outside-copy-sources issue, but I think that’s another
discussion altogether.
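
One way to check for this up front is to scan an unfiltered dump stream for copy sources outside the project. The sketch below is a plain text scan, not a dump parser: it relies on the `Node-copyfrom-path:` header that the dump format records for copied nodes, and file contents that happen to contain the same text would be reported as well. `outside_copy_sources` is a hypothetical helper name.

```shell
#!/bin/sh
# Print every copy source in a dump stream (stdin) that does not lie at
# or below the given path prefix. Usage: outside_copy_sources PREFIX
outside_copy_sources() {
    LC_ALL=C awk -v prefix="$1" '
        /^Node-copyfrom-path: / {
            src = substr($0, 21)        # text after the header name
            sub(/^\//, "", src)         # tolerate a leading slash
            if (src != prefix && index(src, prefix "/") != 1)
                print src
        }' | sort -u
}
```

For example, `svnadmin dump --incremental -r "$rev" /path/to/master | outside_copy_sources "$PROJECT"` would list the outside locations that a given revision copies from; empty output suggests the revision is safe to filter on its own.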

> svnadmin dump $repo --quiet -r $rev --incremental >> $project.$rev.bak

> Adding to revision files with >> should be impossible in your
> approach.

Are you saying that appending to an existing dump file in general is a
problem or just with all of his node-path processing? I have had no
trouble appending to existing dump files.

Thanks,

Bryon Winger
Received on 2013-04-03 11:40:17 CEST
