
Re: Splitting out project from repo

From: Bryon Winger <bryonwinger_at_gmail.com>
Date: Tue, 2 Apr 2013 14:32:05 -0700 (PDT)

I am going through a similar process myself and have some questions about
your concerns. I'm not trying to rock the boat, just looking for clarity on
a few points.
 
For perspective, I am working with around 300 individual projects
in a 70+ GB repository containing over 300k revisions.

> If I understand correctly, you manually retrieve each version where
> the given path/project has changed in any way to afterwards dump those
> revisions. Why is this better/faster than using svndumpfilter with
> specifying an include path, but without the need to post process the
> dump files?

I personally don't see the advantage of waiting around for svnadmin dump
to process every unrelated revision. For one project, I am only concerned
with about 200 revisions, spread out over 210k unrelated revisions.

# This example took around 8 hours:

svnadmin dump /path/to/master | svndumpfilter --drop-empty-revs \
    --renumber-revs include $PROJECT > $PROJECT.dump

# However, when I run this on the same project:

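# Pick out the "rNNN | author | date | ..." header lines of the log, then
# dump and filter only those revisions. ${rev:1} strips the leading "r".
for rev in `svn log -r0:HEAD file:///path/to/master/$PROJECT | egrep \
        '^r[0-9]+ \|' | cut -d " " -f1`; do
    svnadmin dump --incremental -r ${rev:1} /path/to/master | svndumpfilter \
        include $PROJECT >> $PROJECT.dump
done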

 

… I can have a usable dump file in under 30 seconds. I realize this will take
longer for larger projects, but I think it makes my point. In the first
command, ‘svnadmin dump’ is still creating a full dump stream for each
revision before svndumpfilter sees that revision and decides whether to keep
it.
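
A minimal sketch of the same per-revision approach, split into two small functions so the revision-picking step can be checked on its own. The names revs_from_log and dump_project are made up for illustration, and the sketch assumes the default `svn log` header format (`rNNN | author | date`); it passes -q to `svn log` so that log messages cannot be mistaken for header lines.

```shell
#!/bin/sh
# Sketch only: revs_from_log and dump_project are hypothetical helper names.

# Read `svn log` output on stdin and print the bare revision numbers,
# one per line, in the order they appear in the log.
revs_from_log() {
    sed -n 's/^r\([0-9][0-9]*\) | .*/\1/p'
}

# Dump only the revisions that touched a project, filtering each one.
# $1 = repository path on disk
# $2 = project path inside the repository
# $3 = output dump file (truncated first, then appended to)
dump_project() {
    repo=$1
    project=$2
    out=$3
    : > "$out"
    for rev in $(svn log -q -r0:HEAD "file://$repo/$project" | revs_from_log); do
        svnadmin dump --quiet --incremental -r "$rev" "$repo" |
            svndumpfilter --quiet include "$project" >> "$out"
    done
}
```

Under those assumptions it would be called as `dump_project /path/to/master "$PROJECT" "$PROJECT.dump"`.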

 

> Are you sure your approach doesn't need other paths
> from the repo, e.g. other source paths from copy operations for
> projects or stuff like that?
>
 

I absolutely agree with checking for this. You can’t successfully pull out
a single path using svnadmin dump / svndumpfilter if there are copies from a
location outside of whatever you are filtering for.
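
One rough way to check for this before filtering is to scan an unfiltered dump stream for copy sources that live outside the project. The sketch below relies on the Node-copyfrom-path header that the dump format writes for copied nodes. The name outside_copy_sources is made up, the project path is assumed to appear without a leading slash, and because the raw stream is searched, file contents that happen to contain such a line would show up as false positives.

```shell
#!/bin/sh
# Sketch only: outside_copy_sources is a hypothetical helper name.

# Read a dump stream on stdin and print every copy source path that is
# neither the project path itself nor somewhere beneath it.
# $1 = project path as it appears in the dump (assumed: no leading slash)
outside_copy_sources() {
    grep -a '^Node-copyfrom-path: ' |
        sed 's/^Node-copyfrom-path: //' |
        awk -v p="$1" '$0 != p && index($0, p "/") != 1' |
        sort -u
}
```

For a single revision this might be run as `svnadmin dump --quiet --incremental -r "$rev" /path/to/master | outside_copy_sources "$PROJECT"`; any output means that revision copies from somewhere the filter would exclude.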

 

I did notice that using svnrdump pointing to url/project seems to get
around the outside-copy-sources issue, but I think that’s another
discussion altogether.

 

> > svnadmin dump $repo --quiet -r $rev --incremental >> $project.$rev.bak
>
> Adding to revision files with >> should be impossible in your
> approach.

 
 
Are you saying that appending to an existing dump file in general is a
problem, or just with all of his node-path processing? I have had no
trouble appending to existing dump files.

 

Thanks,

Bryon Winger
Received on 2013-04-02 23:40:23 CEST

This is an archived mail posted to the Subversion Users mailing list.
