
Re: Antwort: Re: Antwort: Re: Antwort: Re: Problem using subversion on a large repository

From: <kmradke_at_rockwellcollins.com>
Date: Tue, 28 Oct 2008 09:33:24 -0500

Andreas.Otto_at_versit.de wrote on 10/28/2008 04:07:03 AM:
>
> The tags were created over the repository's lifetime, across many
> revisions, but the number of files changed between revisions is far
> smaller than the total number of files in a revision.
>
> The problem is just that dumping everything (including all revisions) with:
>
> svn dump ...

If you want to do a full dump, use "svnadmin dump --deltas REPOS_PATH >
out.dump". This will create a dumpfile that should be similar in size to the
REPOS_PATH directory. I have dumped thousands of multi-GB repositories and
have never seen a --deltas dumpfile larger than 3x the FSFS directory size.

As a recent example, a 5G FSFS repo had a 53G full dumpfile, but a
10G --deltas dumpfile.
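
To put those figures side by side, here is a minimal Python sketch of the
arithmetic. The helper names size_ratio and dump_fits are hypothetical (they
are not part of Subversion), and using 3x as a planning factor for free disk
space is an assumption drawn from the observation above, not a guarantee.

```python
def size_ratio(dump_gb, repo_gb):
    """Return how many times larger a dumpfile is than its repository."""
    if repo_gb <= 0:
        raise ValueError("repository size must be positive")
    return dump_gb / repo_gb


def dump_fits(repo_gb, free_gb, factor=3.0):
    """Rough check: is there room for a dump of up to `factor` x the repo size?"""
    return repo_gb * factor <= free_gb


# Figures from the example above: 5G FSFS repo, 53G full dump, 10G --deltas dump.
print(f"full dump:     {size_ratio(53, 5):.1f}x the repository")
print(f"--deltas dump: {size_ratio(10, 5):.1f}x the repository")

# With 3x as an assumed planning factor, a 5G repo wants about 15G free.
print(dump_fits(repo_gb=5, free_gb=15))
print(dump_fits(repo_gb=5, free_gb=10))
```

Running a check like this before starting a dump is cheaper than finding out
halfway through that the disk is full.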

> creates a huge dumpfile, much larger than the repository itself.
> P.S. I ran out of disk space.
>
> As a first workaround I created a dump file of a single revision:
>
> svn dump -r ### | gzip -9
>
> This creates a ~80G gzipped dump file.

Using -r together with --incremental will only dump the data for the
files changed in that revision. Without --incremental, the first revision
in the dumped range is written out as a full copy of the tree at that
revision, tags included, which could explain a dumpfile of this size.
Incorrect svndumpfilter commands in the past are another possible cause.
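
One way to check where the bytes in a dumpfile actually go is to total the
file text per revision. The following is a minimal Python sketch, assuming
the usual dumpfile record layout (a block of "Key: value" headers, a blank
line, then Content-length bytes of body). The function names and the
hand-written SAMPLE stream are hypothetical illustrations, not output from
a real repository.

```python
import io
from collections import defaultdict


def skip(stream, count):
    """Discard `count` bytes from a stream that may be a pipe (no seek)."""
    while count > 0:
        chunk = stream.read(min(count, 1 << 20))
        if not chunk:
            break
        count -= len(chunk)


def read_records(stream):
    """Yield the header dict of each dump record, skipping record bodies."""
    while True:
        line = stream.readline()
        while line == b"\n":              # blank lines separate records
            line = stream.readline()
        if not line:                      # end of stream
            return
        headers = {}
        while line not in (b"", b"\n"):
            key, _, value = line.decode("utf-8").rstrip("\n").partition(": ")
            headers[key] = value
            line = stream.readline()
        skip(stream, int(headers.get("Content-length", 0)))
        yield headers


def text_bytes_per_revision(stream):
    """Map revision number -> total Text-content-length of its nodes."""
    totals = defaultdict(int)
    revision = None
    for headers in read_records(stream):
        if "Revision-number" in headers:
            revision = int(headers["Revision-number"])
            totals[revision] += 0         # list revisions without file text too
        elif "Node-path" in headers and revision is not None:
            totals[revision] += int(headers.get("Text-content-length", 0))
    return dict(totals)


# Hand-written, simplified stream: r1 adds a 6-byte file, r2 tags trunk by copy.
SAMPLE = (
    b"SVN-fs-dump-format-version: 2\n\n"
    b"UUID: 00000000-0000-0000-0000-000000000000\n\n"
    b"Revision-number: 1\nProp-content-length: 10\nContent-length: 10\n\n"
    b"PROPS-END\n\n"
    b"Node-path: trunk/a.txt\nNode-kind: file\nNode-action: add\n"
    b"Prop-content-length: 10\nText-content-length: 6\nContent-length: 16\n\n"
    b"PROPS-END\nhello\n\n\n"
    b"Revision-number: 2\nProp-content-length: 10\nContent-length: 10\n\n"
    b"PROPS-END\n\n"
    b"Node-path: tags/1.0\nNode-kind: dir\nNode-action: add\n"
    b"Node-copyfrom-rev: 1\nNode-copyfrom-path: trunk\n\n\n"
)

print(text_bytes_per_revision(io.BytesIO(SAMPLE)))  # {1: 6, 2: 0}
```

In the sample, the tag in r2 is recorded as a copy and carries no file text.
If a real dump shows a tag revision with gigabytes of text, the copies have
been expanded somewhere along the way. For a real dumpfile, open it with
open(path, "rb") or pass sys.stdin.buffer when reading from a pipe.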

> My goal was to filter some old paths that are no longer used out of the
> history with

Be aware that if you remove a path that has multiple tags, it will create
full copies in each of the tag paths. I'm guessing you may have done this
previously, which is causing your current problems.

Dumping and filtering a repository can cause the repository size to
grow significantly if you do not do it VERY carefully...

svn obliterate is something that has been discussed in the past.
While useful for some cases, it has not yet been implemented.

svnadmin dump REPO | svndumpfilter exclude BADPATH | svnadmin load NEWREPO
does work when needed and done with specific guidelines. (Make sure
ALL tags that reference the filtered path are also filtered.)
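
To find which tags reference a path before filtering it, the dump stream can
be scanned for copies whose source lies under that path. This is a minimal
Python sketch under the same assumptions about the dumpfile record layout
(headers, blank line, Content-length bytes of body). The names copies_from
and read_records and the SAMPLE stream are hypothetical. It only catches
copies taken directly from the path or from below it, not copies of a parent
directory that merely contains it.

```python
import io


def read_records(stream):
    """Yield the header dict of each dump record, skipping record bodies."""
    while True:
        line = stream.readline()
        while line == b"\n":              # blank lines separate records
            line = stream.readline()
        if not line:                      # end of stream
            return
        headers = {}
        while line not in (b"", b"\n"):
            key, _, value = line.decode("utf-8").rstrip("\n").partition(": ")
            headers[key] = value
            line = stream.readline()
        remaining = int(headers.get("Content-length", 0))
        while remaining > 0:              # read, not seek, so pipes work too
            chunk = stream.read(min(remaining, 1 << 20))
            if not chunk:
                break
            remaining -= len(chunk)
        yield headers


def copies_from(stream, bad_path):
    """Yield (revision, node_path, source_path) for copies taken from bad_path."""
    bad = bad_path.strip("/")
    revision = None
    for headers in read_records(stream):
        if "Revision-number" in headers:
            revision = int(headers["Revision-number"])
        source = headers.get("Node-copyfrom-path")
        if source is None:
            continue
        source = source.strip("/")
        if source == bad or source.startswith(bad + "/"):
            yield revision, headers["Node-path"], source


# Hand-written, simplified stream: r1 creates trunk, r2 tags it by copy.
SAMPLE = (
    b"SVN-fs-dump-format-version: 2\n\n"
    b"Revision-number: 1\nProp-content-length: 10\nContent-length: 10\n\n"
    b"PROPS-END\n\n"
    b"Node-path: trunk\nNode-kind: dir\nNode-action: add\n"
    b"Prop-content-length: 10\nContent-length: 10\n\n"
    b"PROPS-END\n\n\n"
    b"Revision-number: 2\nProp-content-length: 10\nContent-length: 10\n\n"
    b"PROPS-END\n\n"
    b"Node-path: tags/1.0\nNode-kind: dir\nNode-action: add\n"
    b"Node-copyfrom-rev: 1\nNode-copyfrom-path: trunk\n\n\n"
)

for rev, dest, src in copies_from(io.BytesIO(SAMPLE), "trunk"):
    print(f"r{rev}: {dest} was copied from {src}")
```

Every destination reported this way is a candidate for the same svndumpfilter
exclude list as BADPATH itself.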

Kevin R.

> svn dump ... | svndumpfilter ... | svn load ...
>
> but this job never finished. One problem is that the transaction log from
>
> svn load ...
>
> becomes really big and the process runs out of disk space.
>
> My solution was:
>
> 1. Delete all tags from the HEAD of the initial repository
>    -> disadvantage: I lost all my tags
>
> 2. Dump just the HEAD
>
>    -> svn dump ... | svn load ...
>
> 3. End up with a new repository without history, but cleaner than the
>    old one
>
> My problem was that everything in terms of maintenance (removing old
> revisions/paths ...) has to be done with dump/filter/load.
>
> Kind regards
>
> Andreas Otto
> ISV13 - Systemverantwortung Leben
>
> Phone 0431/603-2388
> Sophienblatt 33
> 24114 Kiel
>
> VersIT Versicherungs-Informatik GmbH
> Gottlieb-Daimler-Str. 2, 68165 Mannheim
> Commercial register: Mannheim HRB 6287
> Chairman of the Management Board: Claus-Peter Gutt
>
>
> kmradke_at_rockwellcollins.com wrote on 27.10.2008 16:10:
> To: Andreas.Otto_at_versit.de
> Cc: haisenko_at_comdasys.com, users_at_subversion.tigris.org
> Subject: Re: Antwort: Re: Antwort: Re: Problem using subversion on a large repository
>
>
> Andreas.Otto_at_versit.de wrote on 10/27/2008 05:14:45 AM:
> > This is from the svnbook; the text below describes my experience very
> > well.
> >
> > A single-revision dump took ~1 day and created a ~80G gzip -9 dump from
> > a 7G repository.
> >
> > P.S. The --deltas option does not count because I took only one single
> > revision -> no history.
> >
> > The big problem is creating tags with svn copy.
> >
> > Just to be clear: one single file in different repository trees created
> > by svn copy will be dumped for every tree as full text.
> >
> > By my count, 5G head tree x ~200 tags -> 1TB dump ... for one single
> > revision.
>
> There are only 2 ways I can see this happening:
>
> 1) You create 200 tags in the SAME revision and are dumping only that
>    revision
> 2) You are using svndumpfilter and removed the original file path, thus
>    causing all the tags to become individual real copies.
>
> What are the exact commands you are running?
> What is your repository layout?
>
> (And yes, I have a repo with a 40G /trunk and many tags...)
>
> Kevin R.
>
> >
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><

> > $ svnadmin create newrepos
> > $ svnadmin dump oldrepos | svnadmin load newrepos
> >
> > By default, the dump file will be quite large, much larger than the
> > repository itself. That's because by default every version of every file
> > is expressed as a full text in the dump file. This is the fastest and
> > simplest behavior, and it's nice if you're piping the dump data directly
> > into some other process (such as a compression program, filtering
> > program, or loading process). But if you're creating a dump file for
> > longer-term storage, you'll likely want to save disk space by using the
> > --deltas option. With this option, successive revisions of files will be
> > output as compressed, binary differences, just as file revisions are
> > stored in a repository. This option is slower, but it results in a dump
> > file much closer in size to the original repository.
> >
> > We mentioned previously that svnadmin dump outputs a range of revisions.
> > Use the --revision (-r) option to specify a single revision, or a range
> > of revisions, to dump. If you omit this option, all the existing
> > repository revisions will be dumped.
> >
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> >
> > Marc Haisenko <haisenko_at_comdasys.com> wrote on 26.10.2008 12:36:
> > To: Andreas.Otto_at_versit.de
> > Subject: Re: Antwort: Re: Problem using subversion on a large repository
> >
> >
> > On Friday 24 October 2008 13:37:33 Andreas.Otto_at_versit.de wrote:
> > > Hi,
> > >
> > > Thanks for the answer. Now my results:
> > >
> > > -> Yes, you are right, the repository size right now is:
> > >    7270760 K -> 7G
> > >
> > > I tried hard to solve this problem. Here is my experience:
> > >
> > > 1. This doesn't work:
> > >
> > >    -> svnadmin dump -r $(svnlook youngest $OLD) $OLD |
> > >       svndumpfilter exclude mytags | svnadmin load $NEW
> > >
> > >    The reason is that deleting the huge "mytags" tree (created
> > >    with "svn copy ...") with svndumpfilter !after! "svnadmin dump"
> > >    was unusably slow, because you have to dump everything, ~1TB
> > >    (just an assumption), before you can filter.
> >
> > Your assumption is wrong; the dump is a special representation of your
> > repository database. You seem to assume that, for example, if you have
> > one directory with 1GB and made 10 branches, your dump will now be 10GB
> > in size. This is not the case, as a branch is only a few bytes in size.
> >
> > > 2. The solution was to delete the whole tags directory first with:
> > >
> > >    -> svn delete ....
> > >
> > >    and then use the command above to create the new
> > >    repository. But I have to say:
> > >
> > >    -> creating tags with "svn copy ...." is a design error:
> > >
> > >       nice for small projects, but not usable in a big
> > >       environment !!!
> >
> > Please describe in more detail why you think so.
> >
> > Projects like GCC, KDE and Apache would certainly say that it DOES work
> > nicely for huge projects, and I am willing to bet money that your
> > repository does not even REMOTELY reach their size (e.g. as I write this,
> > KDE is at revision 876044 and I know it's bigger than 34GB (that was
> > their size last December)). With a trunk check-out size of 2GB (AFAIK)
> > you can see that this is very efficient, given that they also have a huge
> > number of branches and tags. Have a look yourself: http://websvn.kde.org
> >
> > >
> > > 3. I just support the "old-style" position that
> > >
> > >    -> svn dump OLD | svn load NEW
> > >
> > >    should create a repository NEW with close to the same size as OLD
> >
> > If you update from an older Subversion to a newer version, it is even
> > likely that your new repository is SMALLER than the old one due to
> > improved handling of deltas.
> >
> > It will be bigger with 1.5 however due to improved merge-tracking.
> >
> > --
> > Marc Haisenko
> >
> > Comdasys AG
> > Rüdesheimer Str. 7
> > 80686 München
> > Germany
> >
> > Tel.: +49 (0)89 548 433 321
Received on 2008-10-28 15:34:06 CET

This is an archived mail posted to the Subversion Users mailing list.
