
Reply: Re: Reply: Re: Reply: Re: Problem using subversion on a large repository

From: <Andreas.Otto_at_versit.de>
Date: Tue, 28 Oct 2008 10:07:07 +0100

Hi,

        The tags were created over the repository's lifetime, across many
        revisions, but the number of files changed between revisions is far
        smaller than the total number of files in a revision.

        The problem is just that dumping everything (including all
        revisions) with:

                svnadmin dump ...

        creates a huge dump file, much larger than the repository itself.
                P.S. I ran out of disk space.

        As a first workaround I created a dump file of a single revision:

                svnadmin dump -r ### | gzip -9

        This creates a ~80G gzip dump file.

        My goal was to filter some old paths that are no longer used out of
        the history with:

                svnadmin dump ... | svndumpfilter ... | svnadmin load ...

        but this job never finished. One problem is that the transaction
        log from

                svnadmin load ...

        becomes really big and the process runs out of disk space.

        My solution was:

                1. Delete all tags from the HEAD of the initial repository
                        -> disadvantage: I lost all my tags

                2. Dump just the HEAD

                        -> svnadmin dump ... | svnadmin load ...

                3. End up with a new repository without history, but
                   cleaner than the old one

        My problem was that all maintenance (removing old
        revisions/paths ...) has to be done with dump/filter/load.
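
One way to keep the history while filtering is to run the dump/filter/load pipeline over one revision range at a time instead of in a single pass. The following is only a sketch under assumptions: the repository paths, the chunk size and the helper names `rev_ranges` and `migrate_in_chunks` are hypothetical, `mytags` is the path name used in the quoted message further down, and nothing here was tested against the repository discussed in this thread. Whether chunking actually avoids the disk-space problem depends on the repository backend.

```shell
# Sketch only: paths, chunk size and the excluded path "mytags" are placeholders.

# Print revision ranges LOW:HIGH that cover 0..LAST in steps of STEP.
rev_ranges() {
    last=$1
    step=$2
    low=0
    while [ "$low" -le "$last" ]; do
        high=$((low + step - 1))
        if [ "$high" -gt "$last" ]; then
            high=$last
        fi
        echo "$low:$high"
        low=$((high + 1))
    done
}

# Dump, filter and load one revision range at a time.
# --incremental makes each chunk contain only the changes made in its range.
# svndumpfilter is run without --drop-empty-revs or --renumber-revs so that
# revision numbers stay aligned between the chunks.
# Note: only the exit status of the last pipeline stage is checked here.
migrate_in_chunks() {
    old=$1
    new=$2
    step=$3
    last=$(svnlook youngest "$old") || return 1
    for range in $(rev_ranges "$last" "$step"); do
        svnadmin dump -q --incremental -r "$range" "$old" |
            svndumpfilter exclude mytags |
            svnadmin load -q "$new" || return 1
    done
}
```

A call such as `migrate_in_chunks /path/to/old /path/to/new 1000` (both paths hypothetical, with the target created beforehand by `svnadmin create`) would then process 1000 revisions per pipeline run.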


Kind regards

Andreas Otto
ISV13 - Systemverantwortung Leben

Phone 0431/603-2388
Sophienblatt 33
24114 Kiel

VersIT Versicherungs-Informatik GmbH
Gottlieb-Daimler-Str. 2, 68165 Mannheim
Register court: Mannheim HRB 6287
Chairman of the Management Board: Claus-Peter Gutt






kmradke_at_rockwellcollins.com
27.10.2008 16:10

To: Andreas.Otto_at_versit.de
Cc: haisenko_at_comdasys.com, users_at_subversion.tigris.org
Subject: Re: Reply: Re: Reply: Re: Problem using subversion on a large repository

Andreas.Otto_at_versit.de wrote on 10/27/2008 05:14:45 AM:
> this is from the svnbook; the text below describes my experience
> very well
>
> one single-revision dump took ~1 day, creating a ~80G gzip -9 dump
> from a 7G repository
>
> P.S. the --deltas option does not help because I took only one
> single revision -> no history
>
> the big problem is creating tags with svn copy
>
> just to be clear: one single file in different repository trees
> created by svn copy will be dumped for every tree as full text
>
> by my count, 5G head tree x ~200 tags -> 1TB dump ... for one
> single revision
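
The estimate quoted above is plain multiplication. As a rough check, here is a minimal shell sketch that assumes every tag is written out as full text in a single-revision dump; the figures are the approximate ones from the quoted message, not measurements, and the variable names are illustrative.

```shell
# Figures taken from the quoted message (approximate, not measured).
head_gb=5     # size of the HEAD tree in GB
tags=200      # approximate number of tags

# Worst case: each tag is dumped as a full-text copy of the tree.
estimate_gb=$((head_gb * tags))
echo "full-text dump estimate: ${estimate_gb} GB"
```

That gives 1000 GB, i.e. roughly the 1TB mentioned, before any compression.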

There are only 2 ways I can see this happening:

1) You create 200 tags in the SAME revision and are dumping only that
   revision
2) You are using svndumpfilter and removed the original file path, thus
   causing all the tags to become individual real copies.

What are the exact commands you are running?
What is your repository layout?

(And yes, I have a repo with a 40G /trunk and many tags...)

Kevin R.

>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>><

> $ svnadmin create newrepos
> $ svnadmin dump oldrepos | svnadmin load newrepos
> By default, the dump file will be quite large, much larger than the
> repository itself. That's because by default every version of every
> file is expressed as a full text in the dump file. This is the fastest
> and simplest behavior, and it's nice if you're piping the dump data
> directly into some other process (such as a compression program,
> filtering program, or loading process). But if you're creating a dump
> file for longer-term storage, you'll likely want to save disk space by
> using the --deltas option. With this option, successive revisions of
> files will be output as compressed, binary differences, just as file
> revisions are stored in a repository. This option is slower, but it
> results in a dump file much closer in size to the original repository.
> We mentioned previously that svnadmin dump outputs a range of
> revisions. Use the --revision (-r) option to specify a single revision,
> or a range of revisions, to dump. If you omit this option, all the
> existing repository revisions will be dumped.
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
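
To compare the effect of `--deltas` without writing a dump file to disk, the dump stream can be piped straight into a byte counter. This is a minimal sketch; `dump_size` is a hypothetical helper name, and the repository path `oldrepos` in the usage comment is the one from the book excerpt above.

```shell
# Hypothetical helper: run a command and print how many bytes it writes
# to stdout, without storing the output anywhere.
dump_size() {
    "$@" | wc -c | tr -d ' '
}

# Usage (assumes a repository at ./oldrepos):
#   dump_size svnadmin dump -q oldrepos
#   dump_size svnadmin dump -q --deltas oldrepos
#   dump_size svnadmin dump -q -r 100:200 --incremental oldrepos
```

The dump still has to be generated in full, so this takes as long as a real dump, but it needs no extra disk space.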
>
> Kind regards
>
> Andreas Otto
> ISV13 - Systemverantwortung Leben
>
> Marc Haisenko <haisenko_at_comdasys.com>
> 26.10.2008 12:36
>
> To: Andreas.Otto_at_versit.de
> Cc:
> Subject: Re: Reply: Re: Problem using subversion on a large repository
>
> On Friday 24 October 2008 13:37:33 Andreas.Otto_at_versit.de wrote:
> > Hi,
> >
> > thanks for the answer ... now my results ..
> >
> > -> yeah, you are right, the repository size right now is:
> > 7270760 K -> 7G
> >
> > I tried hard to solve this problem -> now I present my experience:
> >
> > 1. this doesn't work:
> >
> > -> svnadmin dump -r $(svnlook youngest $OLD) $OLD |
> > svndumpfilter exclude mytags | svnadmin load $NEW
> >
> > the reason is that deleting the huge "mytags" tree created
> > with "svn copy ..." with svndumpfilter !after! "svnadmin dump"
> > was unusably slow, because before deleting you have to
> > dump everything, ~1TB (just an assumption), before you can filter
>
> Your assumption is wrong, the dump is a special representation of your
> repository database. You seem to assume that, for example, if you have
> one directory with 1GB and made 10 branches, your dump will now be 10GB
> in size. This is not the case as a branch is only a few bytes in size.
>
> > 2. the solution was to delete the whole tags directory first with:
> >
> > -> svn delete ....
> >
> > and then use the command above to create the new
> > repository, but I have to say:
> >
> > -> creating tags with "svn copy ...." is a design error
> >
> > nice for small projects but not usable in a big
> > environment !!!
>
> Please describe this point in more detail: why do you think so?
>
> Projects like GCC, KDE and Apache would certainly say that it DOES work
> nicely for huge projects, and I am willing to bet money on the fact that
> your repository does not even REMOTELY reach their size (e.g. as I write
> this, KDE is at revision 876044 and I know it's bigger than 34GB (that
> was their size last December)). With a trunk check-out size of 2GB
> (AFAIK) you can see that this is very efficient given the fact that they
> also have a huge number of branches and tags. Have a look yourself:
> http://websvn.kde.org
>
> >
> > 3. I just support the "old-style" position that
> >
> > -> svnadmin dump OLD | svnadmin load NEW
> >
> > should create a repository NEW with close to the same size as
> > OLD
>
> If you update from an older Subversion to a newer version it is even
> likely that your new repository is SMALLER than the older one due to
> improved handling of deltas.
>
> It will be bigger with 1.5, however, due to improved merge-tracking.
>
> --
> Marc Haisenko
>
> Comdasys AG
> Rüdesheimer Str. 7
> 80686 München
> Germany
>
> Tel.: +49 (0)89 548 433 321

Received on 2008-10-28 10:07:38 CET

This is an archived mail posted to the Subversion Users mailing list.
