
Space wasting

From: Adal Chiriliuc <adal_at_myrealbox.com>
Date: 2004-03-08 11:50:14 CET

I've measured the disk space used by an export and by a check-out of
the Boost library so that we can see how much overhead SVN adds.

Exported Boost
  481 folders
  5905 files
  34 MB (theoretical)

Checked-out Boost
  5291 folders
  25544 files
  69 MB (theoretical)

As expected, the checked-out version is twice the size of the exported
version because base versions of the files are kept in the .svn folders.
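
The folder count, file count and theoretical size above can be
gathered with a short script. This is only a sketch of one way to do
it, assuming Python; the function name tree_stats is made up for
illustration and is not the tool used for the numbers in this mail.

```python
import os

def tree_stats(root):
    """Return (folder_count, file_count, total_bytes) for everything below root.

    total_bytes is the "theoretical" size: the sum of the file lengths,
    ignoring cluster rounding and directory overhead.
    """
    folders = files = total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        folders += len(dirnames)   # subfolders, including any .svn folders
        files += len(filenames)
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return folders, files, total
```

Running it once on the exported tree and once on the checked-out tree
gives the two sets of figures to compare.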

Now let's see the actual disk space used for the two cases.
Measurements were done on Windows XP. The "real disk size" value was
computed by subtracting the amount of free space after the operation
from the amount of free space before it.
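
The free-space subtraction can be automated. The following is a
minimal sketch, assuming Python 3 and its shutil.disk_usage call; the
helper name measure_real_disk_size and the idea of passing the
operation as a callable are illustrative assumptions, not how the
numbers in this mail were obtained.

```python
import shutil

def measure_real_disk_size(volume, operation):
    """Return the bytes of free space consumed on `volume` by `operation`.

    `operation` is any callable that performs the export or check-out.
    The result is only meaningful if nothing else writes to the volume
    while the operation runs.
    """
    free_before = shutil.disk_usage(volume).free
    operation()
    free_after = shutil.disk_usage(volume).free
    return free_before - free_after
```

For example, operation could be a function that runs "svn checkout"
through subprocess, with volume set to the drive holding the working
copy.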

Two filesystems were considered, FAT32 and NTFS, both with 4K
clusters. The sizes reported for FAT32 should be identical on
different systems. The NTFS sizes can vary a lot because of the way
this filesystem works, but they should still be fairly similar across
systems. As the figures below show, NTFS is more efficient at storing
large numbers of small folders and files (small items are stored in
the MFT along with the names and attributes, so a cluster is not
wasted). An interesting fact is that the exported version is bigger
on NTFS than on FAT32.
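
To see why many small items cost so much with 4K clusters, here is a
rough model of cluster rounding. It is a simplified sketch, assuming
that every non-empty file occupies a whole number of clusters and that
every folder occupies exactly one cluster (a lower bound on FAT32). It
ignores the FAT itself and does not model NTFS, where small items can
live inside the MFT. The name allocated_size is hypothetical.

```python
def allocated_size(file_sizes, folder_count=0, cluster=4096):
    """Estimate the bytes allocated on disk under simple cluster rounding.

    file_sizes   -- iterable of file lengths in bytes
    folder_count -- number of folders, each assumed to take one cluster
    cluster      -- cluster size in bytes (4K in the measurements above)
    """
    # Round every file up to the next multiple of the cluster size.
    # An empty file rounds up to zero clusters.
    file_bytes = sum(-(-size // cluster) * cluster for size in file_sizes)
    return file_bytes + folder_count * cluster
```

Under this model a 1-byte file costs as much as a 4096-byte one, and
each extra folder costs a full cluster, so the 5291 folders of the
checked-out tree alone would account for roughly 20 MB.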

Filesystem independent data
  Exported file count - 5905 - 100%
  Checked-out file count - 25544 - 432% <<<

  Exported folder count - 481 - 100%
  Checked-out folder count - 5291 - 1100% <<<
  
  Exported theoretical size - 34 MB - 100%
  Checked-out theoretical size - 70 MB - 205%

FAT32
  Exported real disk size - 49 MB - 100%
  Checked-out real disk size - 170 MB - 347% <<<

NTFS
  Exported real disk size - 54 MB - 100%
  Checked-out real disk size - 136 MB - 251%

On FAT32 the SVN overhead is almost 150 percentage points more than
expected (347% vs. 200%), mainly because of the large number of
folders contained in .svn. On NTFS the situation is better, but bear
in mind that this is the lower bound. If you work daily in the
working folder, the disk usage could start to approach the FAT32
figure because directory data migrates out of the MFT.
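
The percentages can be rechecked from the rounded MB values in the
tables. This is only a small sketch, assuming Python; percent_of is a
made-up helper name, and because the inputs are rounded to whole
megabytes the results can differ by a point from figures computed on
exact byte counts.

```python
def percent_of(value, baseline):
    """Express `value` as a percentage of `baseline`."""
    return 100.0 * value / baseline

EXPECTED = 200.0  # a check-out is expected to be about twice the export

fat32 = percent_of(170, 49)   # checked-out vs. exported real size on FAT32
ntfs = percent_of(136, 54)    # checked-out vs. exported real size on NTFS

for name, pct in (("FAT32", fat32), ("NTFS", ntfs)):
    print(f"{name}: {pct:.0f}% of export, "
          f"{pct - EXPECTED:.0f} points above the expected 200%")
```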

I'm curious what the situation is on Linux filesystems.

Why doesn't SVN use a single folder? Why does it need 9 subfolders?
Also, are the README.txt file and the empty-file really needed?

Adal Chiriliuc

Received on Mon Mar 8 11:50:39 2004

This is an archived mail posted to the Subversion Dev mailing list.