
Re: The cost of fulltexts on branches

From: Tobias Ringström <tobias_at_ringstrom.mine.nu>
Date: 2004-06-30 10:09:08 CEST

Branko Čibej wrote:

> It looks like our text storage policy is a bit naïve, then. In this
> case, trunk has only 8% of the fulltexts (and only 4% of the total
> size) in the whole repository. That's really horrible. Of course, GCC
> is not your run-of-the-mill open source project, but getting GCC to
> adopt Subversion would be nice, and it won't happen if we have this
> kind of overhead. There are other interesting projects out there that
> probably have similar complexity.
>
> I wonder if we could introduce some sort of total version ordering
> within a node, so that we could have _one_ fulltext per node (we can,
> of course, but it's not obvious that this is easy to do in 1.x). These
> are all BDB-specific musings, of course; I doubt FSFS would scale well
> to repositories of this size, except for the more size-efficient text
> storage, of course.

The conversion to fsfs completed during the night, and the result is 3.1 GiB
in total, i.e. 2.5 GiB less than the BDB repository, which was 5.6 GiB.

Scalability is a hard question, of course, but exporting
trunk took 3m56s real time with fsfs and 6m21s with bdb. The fsfs export
used 1m28s CPU, and the bdb export used 0m47s CPU. This is on a 3.2 GHz
P4-HT with 1 GiB RAM and a modern but "normal" 7200 RPM SATA disk. The
export was performed over file://.
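
To put the sizes and timings above in proportion, here is a small sketch of the arithmetic in Python. It uses only the figures quoted in this message; the variable and function names are illustrative and do not come from any Subversion tooling.

```python
# Figures quoted in this message (sizes in GiB, times as minutes/seconds).
BDB_GIB = 5.6
FSFS_GIB = 3.1


def to_seconds(minutes, seconds):
    """Convert a 'XmYYs' style timing into whole seconds."""
    return 60 * minutes + seconds


# Wall-clock ("real") and CPU time for exporting trunk over file://.
real = {"fsfs": to_seconds(3, 56), "bdb": to_seconds(6, 21)}
cpu = {"fsfs": to_seconds(1, 28), "bdb": to_seconds(0, 47)}

# Fraction of the BDB repository size saved by converting to fsfs.
size_saving = (BDB_GIB - FSFS_GIB) / BDB_GIB

# How many times longer the bdb export took in wall-clock time.
real_ratio = real["bdb"] / real["fsfs"]

# How many times more CPU the fsfs export used.
cpu_ratio = cpu["fsfs"] / cpu["bdb"]

print(f"fsfs is {size_saving:.0%} smaller than bdb")
print(f"bdb export took {real_ratio:.2f}x the wall-clock time of fsfs")
print(f"fsfs export used {cpu_ratio:.2f}x the CPU time of bdb")
```

In other words, fsfs comes out a bit under half the size and finishes the export sooner in wall-clock terms, while using nearly twice the CPU time.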

> Thanks for doing this analysis. It's exactly the sort of data point we
> need. Out of interest, do you have any idea how many of those
> fulltexts are directory representations? I suspect it could be a
> significant amount.

My pleasure. The numbers in the table are only file representation
fulltexts. I know that cvs2svn still performs more copies than necessary,
which creates an unusually large number of directory representations, so
I did not want to include those. I'll rerun the test once cvs2svn's last
few shortcomings are taken care of.

/Tobias

Received on Wed Jun 30 10:10:42 2004

This is an archived mail posted to the Subversion Dev mailing list.
