
Dump file bloating not observed (was How big can a repository get?)

From: B Smith-Mannschott <bsmith.occs_at_gmail.com>
Date: Wed, 3 Dec 2008 12:34:46 +0100

*oops, a copy for the list...*

---------- Forwarded message ----------
From: B Smith-Mannschott <bsmith.occs_at_gmail.com>
Date: Wed, Dec 3, 2008 at 12:32 PM
Subject: Re: Re: Antwort: How big can a repository get?
To: Andreas.Otto_at_versit.de

On Wed, Dec 3, 2008 at 11:05 AM, <Andreas.Otto_at_versit.de> wrote:

>
> Hi,
>
> I want to stop this thread because the issue is clear.
>
> To everyone not able to see this issue, I would recommend the following
> test:
>
>
> 1. create an empty rep
> 2. fill it with 100M Data using multiple files
> 3. do a dump
> 4. make tags with svn copy using the whole rep
> 5. do a dump again
>
> -> you will see that the sizes differ
>
> Kind regards
>

Whatever you are doing, you're not describing it accurately. I wrote a
script to reproduce your points above as best I could. I ignored
"using the whole rep" in your fourth point because I have no idea what
that's supposed to mean.

*I observed no significant difference.*

$ svn --version | head -n 1
svn, version 1.5.2 (r32768)

*Here are the results:*

# fill the trunk
# note the size of the wc/trunk, exported
253M wc/trunk
112M wc/trunk.exported
# preserving otto before creating tags
# creating tags for revisions 2 through 42
# preserving otto after creating tags
# note the additional revisions in otto.after
otto.before 42
otto.after 83
# generating dump files
# sizes
15M otto.after
17M otto.after.deltas.dump
97M otto.after.dump
14M otto.before
17M otto.before.deltas.dump
97M otto.before.dump

As you can no doubt see, there's no significant difference in the
sizes of the dump files between before and after.
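
Since du -sh rounds to whole megabytes, a byte-exact comparison may be more convincing. The following is only a minimal sketch: the helper name size_delta is made up for illustration, and the file names are assumed to be the dump files produced by the script in this message.

```bash
#!/bin/bash
# size_delta BEFORE AFTER
# Print the exact byte sizes of two files and the difference between them.
# (Hypothetical helper, not part of the original script.)
size_delta() {
    local before after
    # $(( ... )) strips the leading whitespace some wc implementations emit
    before=$(( $(wc -c < "$1") ))
    after=$(( $(wc -c < "$2") ))
    echo "before=$before after=$after delta=$((after - before))"
}

# Only run the comparison if the dump files from the script are present.
if [ -f otto.before.dump ] && [ -f otto.after.dump ]; then
    size_delta otto.before.dump otto.after.dump
fi
```

A delta that is small relative to the dump size would support the rounded figures shown above.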

Using deltas is helping us here not because we've made edits to file
contents (we haven't) but because deltas are also compressed and the
text files I'm using are easy to compress.

There is some difference in the repository sizes, though this is to be
expected given the fact that FSFS generates two files on my stock EXT3
file system (4KB blocksize) for each revision.
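
As a rough worked example of that overhead, the sketch below computes a lower bound on the extra disk space, assuming every per-revision file occupies at least one file system block. The function name min_overhead_kib is hypothetical; the inputs are the 41 revisions added by tagging (83 minus 42), two files per revision, and 4 KB blocks, all taken from the figures in this message.

```bash
#!/bin/bash
# min_overhead_kib NEW_REVS FILES_PER_REV BLOCK_KIB
# Lower bound on extra disk usage, assuming each file takes at least one block.
# (Hypothetical helper, for illustration only.)
min_overhead_kib() {
    echo $(( $1 * $2 * $3 ))
}

new_revs=$((83 - 42))   # revisions added by creating the tags
echo "at least $(min_overhead_kib "$new_revs" 2 4) KiB"
# prints "at least 328 KiB"
```

This is only a lower bound, and du -sh rounds its output, so the 14M versus 15M reported above is consistent with this estimate rather than explained by it exactly.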

*This is the script I used:*

#!/bin/bash
svnadmin create otto
url=file://$PWD/otto
svn -q co $url wc
svn -q mkdir wc/{tags,trunk,branches}
svn -q ci wc -m "added empty tags, trunk, branches"
echo "# fill the trunk"
n=1
while (( $(du -sm wc/trunk|cut -f1) < 250 ))
do # we keep going until the wc of the trunk is 250 MB, which
    # will get us more than 100 MB of content, even with the
    # overhead of .svn directories
    cp -a about-2.8-megs-of-text-files wc/trunk/$n
    svn -q add wc/trunk/$n
    svn -q commit wc -m "added directory $n to trunk"
    n=$((n + 1))
done
echo "# note the size of the wc/trunk, exported"
svn -q export wc/trunk wc/trunk.exported
du -sh wc/trunk wc/trunk.exported
rm -rf wc
echo "# preserving otto before creating tags"
cp -a otto otto.before
echo "# creating tags for revisions 2 through $n"
while (( n > 1 ))
do
    svn -q cp $url/trunk@$n $url/tags/$n -m "creating tag $n of trunk@$n"
    n=$((n - 1))
done
echo "# preserving otto after creating tags"
mv otto otto.after
echo "# note the additional revisions in otto.after"
echo otto.before $(svnlook youngest otto.before)
echo otto.after $(svnlook youngest otto.after)
echo "# generating dump files"
svnadmin -q dump otto.before > otto.before.dump
svnadmin -q dump --deltas otto.before > otto.before.deltas.dump
svnadmin -q dump otto.after > otto.after.dump
svnadmin -q dump --deltas otto.after > otto.after.deltas.dump
echo "# sizes"
du -sh otto.[ab]*
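
One way to see why the tags add so little to the dump: a copy made with svn cp is recorded in the dump file as a node carrying Node-copyfrom-rev and Node-copyfrom-path headers instead of a fresh copy of the file contents. The sketch below counts such nodes. The helper name count_copies is made up for illustration, and the count can be off if a versioned file happens to contain a line starting with the same header text.

```bash
#!/bin/bash
# count_copies DUMPFILE
# Count nodes in a Subversion dump file that were created by a copy.
# -a makes grep treat the dump as text even if it contains binary data.
# (Hypothetical helper, not part of the original script.)
count_copies() {
    grep -a -c '^Node-copyfrom-path: ' "$1"
}

# Only inspect the dump if the script has already produced it.
if [ -f otto.after.dump ]; then
    echo "copy nodes in otto.after.dump: $(count_copies otto.after.dump)"
fi
```

With the script above, one copy node per tag would be expected in otto.after.dump and none in otto.before.dump.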

// Ben Smith-Mannschott

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=1065&dsMessageId=978828

Received on 2008-12-03 12:35:42 CET