Re: optimizing fsfs: reverse diffs?

From: Jay Berkenbilt <ejb_at_ql.org>
Date: 2005-04-10 16:00:58 CEST

Executive summary: In my experience (and measurements), forward deltas
in fsfs don't cause enough of a performance problem to worry about.
Details follow.

referring to full text on HEAD with reverse deltas in older revisions:

> All that said, it is perhaps faster at checkouts for this reason,
> and having all that transaction machinery makes it easier to do
> other indexing optimizations as well (though I don't know what else
> is actually implemented). If you want it, use it - it's been there
> since 1.0 :-)

>> You called me on it--I've done no profiling myself & have no
>> numbers, only vague accusations. :) Just wanted to see if something
>> like this has been considered. I can tell it has--thanks to you,
>> Daniel, & Mark both.

I was specifically worried about this same issue, so I did some tests.
I have some numbers which I posted several weeks ago. I did some
testing on a 1 GB repository with about 23,000 revisions converted
from CVS using cvs2svn. I don't have the exact figures in front of
me, so these are close approximations. The tests were done on an fsfs
repository on my local disk. My disk is an ATA100 (parallel ATA)
drive on a 2.4 GHz P4 with 1 GB RAM running Debian with a 2.6 kernel.

The file with the most revisions had somewhere around 150 revisions.
It took about 1.5 seconds to check out this file after freshly
unmounting and remounting the drive, and much less time after the
necessary revisions had been cached by the OS. Checking out the
entire trunk took about 14 minutes compared to about 5 minutes
checking out the entire repository from CVS. Assuming the checkout
time for CVS is comparable to straight file copying, one would expect
a factor of two simply because, as already mentioned, Subversion
writes two copies of each file, so a factor of just under three isn't
so bad.
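
As a rough sanity check on the checkout figures above, the arithmetic
can be sketched in a few lines of Python. The variable and function
names are made up for illustration, the timings are the approximate
ones quoted in this message, and the factor of two is taken on the
assumption that "two copies" means the working file plus its pristine
copy. Treat it as a back-of-the-envelope estimate, not a measurement.

```python
# Approximate timings quoted in the message, in minutes.
svn_trunk_checkout_min = 14.0
cvs_checkout_min = 5.0

# Assumption: Subversion writes each file twice (working file plus
# pristine copy), so plain copying alone would account for about 2x.
expected_copy_factor = 2.0


def checkout_overhead(svn_min, cvs_min, copy_factor):
    """Return (observed slowdown, slowdown left after the copy factor)."""
    observed = svn_min / cvs_min
    residual = observed / copy_factor
    return observed, residual


observed_ratio, residual_ratio = checkout_overhead(
    svn_trunk_checkout_min, cvs_checkout_min, expected_copy_factor
)
print(f"observed slowdown: {observed_ratio:.1f}x")
print(f"left over after the two-copy factor: {residual_ratio:.1f}x")
```

On these approximate numbers, whatever is left over after the
two-copy factor covers everything else that differs between the two
checkouts, forward-delta reconstruction included.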

This wasn't a scientific benchmarking process, but it led me to
conclude that, at least for repositories on the order of single-digit
numbers of gigabytes with tens of thousands of revisions where files
have a few hundred revisions, the performance differences, though
measurable, are not at all in the show-stopper category. On top of
that, even with many thousands of branches, the converted fsfs
repository was actually very slightly smaller (less than 1%) than the
original CVS repository.

So, based on my experience, the performance advantages of switching to
a HEAD with reverse diffs strategy for fsfs probably wouldn't
outweigh the advantages of old files being immutable. Keep in mind
also that if your server has a lot of RAM, keeping old repository
files static will help performance a lot, because frequently
referenced revisions will pretty much always be cached.

-- 
Jay Berkenbilt <ejb@ql.org>
Received on Sun Apr 10 16:04:00 2005

This is an archived mail posted to the Subversion Users mailing list.
