
[bug report] SVN design issue: checkouts horribly space-inefficient

From: Marc Mutz <marc_at_klaralvdalens-datakonsult.se>
Date: 2005-06-23 14:54:24 CEST


While checking out some KDE SVN modules, I noticed that my inodes were
quickly being depleted. After a lot of searching I found that the culprit is
not my huge maildir archive but the .svn directory, whose props and prop-base
directories each contain one file per object (file or dir) stored in
the repository. On a typical ext2/3 filesystem, this wastes
almost 98% of the space:

$ cd kdelibs
$ du -sh .svn/props
96K .svn/props
$ wc -c .svn/props/* | grep total
1714 total

This means that, absent tail-end optimizations in the filesystem (which
ext2/3 don't have), the space efficiency is about 1.75% (1714 bytes / 96K).

Same for prop-base.

For comparison, the actual file data (text-base) comes to
$ du -sch .svn/text-base/* | grep total
272K total
$ wc -c .svn/text-base/* | grep total
224236 total
-> 80.5% efficiency.
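
The two ratios above are just payload bytes divided by allocated space. A minimal sketch of that arithmetic, assuming "K" means 1024 bytes as du reports it (efficiency_pct is a hypothetical helper name, not part of svn or coreutils):

```shell
# Sketch only: efficiency_pct is a made-up helper, not an existing tool.
# $1 = payload size in bytes (e.g. the "total" line of wc -c)
# $2 = allocated space in KiB (e.g. the figure from du -sk)
efficiency_pct() {
    awk -v bytes="$1" -v kib="$2" 'BEGIN {
        if (kib <= 0) { print "n/a"; exit }
        # Percentage of the allocated space that holds real data.
        printf "%.1f\n", 100 * bytes / (kib * 1024)
    }'
}

# Usage with the figures quoted in this mail:
#   efficiency_pct 1714 96       # props
#   efficiency_pct 224236 272    # text-base
```

With the numbers above this gives roughly 1.7 for props and 80.5 for text-base.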

Now, the bad thing is that props and prop-base are identical:
$ for i in $(cd .svn/props/; echo *.svn-work ); do
  diff -u .svn/props/$i .svn/prop-base/${i/-work/-base}
  done

So a lot of inodes could be spared by simply hard-linking the files in those
dirs. That brings the efficiency up to 3.5%. The same thing could (optionally) be done
for text-base, on the assumption that most editors break that link during

That would halve the overall space inefficiency.
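
To illustrate the idea only, here is a rough sketch of hard-linking the identical pairs by hand. The function name link_identical_props is hypothetical, and whether the svn client copes with a working copy modified this way is an untested assumption; a real fix would have to live in svn itself, which would also need to break the link before changing a property.

```shell
# Sketch only: link_identical_props is a made-up helper, not an svn command.
# $1 = path to a .svn administrative directory.
link_identical_props() {
    admin=$1
    for work in "$admin"/props/*.svn-work; do
        [ -f "$work" ] || continue              # glob matched nothing
        name=$(basename "$work" .svn-work)
        base="$admin/prop-base/$name.svn-base"
        [ -f "$base" ] || continue              # no pristine counterpart
        [ "$work" -ef "$base" ] && continue     # already the same inode
        cmp -s "$work" "$base" || continue      # contents differ: leave alone
        # Replace the working file with a hard link to the pristine copy.
        ln -f "$base" "$work"
    done
    return 0
}

# Usage (on a scratch copy of a checkout, not one you care about):
#   link_identical_props kdelibs/.svn
```

Files whose properties were modified locally differ from their base copy and are left untouched.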

As it is now, svn uses about 4x more inodes for the same checkout as cvs. I
expected 50% both in space and inodes, due to the offline-diff capability.

In ten years of using Unix, I've never been _near_ the inode limit, though
I've often been permanently in 90%+ disk-full mode.

I hope you can apply these simple "optimizations" in the next release. They
typically speed up diffs, too; see the difference between
 cp -ra old new-1
 cp -la old new-2
 time diff -ur old new-1
 time diff -ur old new-2


Marc Mutz -- marc@klaralvdalens-datakonsult.se, mutz@kde.org
phone: +49 521 521 45 45; mobile: +49 177 32 94 700
Klarälvdalens Datakonsult AB, Platform-independent software solutions
Received on Thu Jun 23 17:47:14 2005

This is an archived mail posted to the Subversion Users mailing list.