
RE: Reality of delta benefit

From: Bill Tutt <billtut_at_microsoft.com>
Date: 2000-08-17 18:27:04 CEST

It's not all about the storage requirement; it's also about how long it
takes to perform source code control operations.
For example:
* Initial insertion
* Checking in a new version of a file
* Retrieving the current version (in a reverse delta scenario)
* Retrieving an intermediate version
* Computing the diff between two distinct versions
* Reporting on revision logs (cvs log)
* Merging branches

Since source code control computes diffs for breakfast, lunch, and dinner,
having the deltas already stored should make some of the operations users
normally perform faster than they would otherwise be.

The interesting thing about XDFS is that you could layer an optimized
caching policy on top of it that computes and caches the most frequently
requested deltas.
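
A minimal sketch of what such a caching layer might look like, assuming a
least-recently-used policy. Nothing here is XDFS's actual API: the
`DeltaCache` class, its `versions` dictionary (a stand-in for the underlying
store), and the use of Python's `difflib` to compute deltas are all
hypothetical choices made for illustration.

```python
import difflib
from collections import OrderedDict


class DeltaCache:
    """LRU cache of deltas keyed by (from_rev, to_rev).

    Hypothetical illustration only; this is not the XDFS interface.
    """

    def __init__(self, versions, capacity=128):
        self.versions = versions    # rev number -> full text (stand-in store)
        self.capacity = capacity
        self._cache = OrderedDict()
        self.misses = 0             # how many deltas had to be computed

    def delta(self, from_rev, to_rev):
        key = (from_rev, to_rev)
        if key in self._cache:
            self._cache.move_to_end(key)      # mark as most recently used
            return self._cache[key]
        self.misses += 1
        a = self.versions[from_rev].splitlines(keepends=True)
        b = self.versions[to_rev].splitlines(keepends=True)
        result = "".join(
            difflib.unified_diff(a, b, f"r{from_rev}", f"r{to_rev}")
        )
        self._cache[key] = result
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)   # evict least recently used
        return result


versions = {1: "a\nb\n", 2: "a\nc\n", 3: "a\nc\nd\n"}
cache = DeltaCache(versions, capacity=2)
first = cache.delta(1, 2)    # computed
second = cache.delta(1, 2)   # served from the cache
```

A real policy would presumably weigh the cost of recomputing a delta against
its size, rather than evicting purely by recency.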

Bill

 -----Original Message-----
From: Jonathan S. Shapiro [mailto:shap@eros-os.org]

In case this is useful to the delta storage debate...

The other day, in the process of figuring out how big a farm to buy, I ran a
size analysis on the EROS source repository. Basically, I expanded every
version ever and examined the disk consumption in blocks under various
compression schemes: RCS, gzip, bzip2, etc.

Somewhat to my surprise, RCS is only very slightly better than gzip (normal
compression -- nothing fancy) on this data set and roughly the same as
bzip2.
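
For anyone wanting to repeat this kind of comparison on their own data, here
is a rough sketch. It is not the script used for the EROS numbers: it counts
bytes rather than disk blocks, and it approximates RCS with a full copy of
the newest version plus `difflib` unified diffs back to each older one. The
function name `storage_sizes` and the sample data are made up for
illustration.

```python
import bz2
import difflib
import gzip


def storage_sizes(versions):
    """Return bytes needed to store every version under each scheme.

    `versions` is a non-empty list of full texts, oldest first.
    """
    raw = sum(len(v.encode()) for v in versions)
    gz = sum(len(gzip.compress(v.encode())) for v in versions)
    bz = sum(len(bz2.compress(v.encode())) for v in versions)

    # Delta scheme (an assumption, loosely modelled on reverse deltas):
    # keep the newest version in full, plus a diff back to each older one.
    delta = len(versions[-1].encode())
    for i in range(len(versions) - 1, 0, -1):
        newer = versions[i].splitlines(keepends=True)
        older = versions[i - 1].splitlines(keepends=True)
        patch = "".join(difflib.unified_diff(newer, older, n=0))
        delta += len(patch.encode())

    return {"raw": raw, "gzip": gz, "bzip2": bz, "delta": delta}


# Made-up history: a 200-line file that grows by one line per revision.
base = "".join(f"line {i}\n" for i in range(200))
history = [base + "".join(f"extra {j}\n" for j in range(k)) for k in range(5)]
sizes = storage_sizes(history)
for scheme, size in sizes.items():
    print(f"{scheme:>6}: {size} bytes")
```

How the schemes rank will depend heavily on file sizes and how much changes
between revisions, so numbers from a toy history like this say little about
a real repository.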

Is this comparable to other measurements people have seen? If so, it strikes
me that storing deltas in the repository itself may multiply I/Os and file
operations for very little benefit.

shap
Received on Sat Oct 21 14:36:07 2006

This is an archived mail posted to the Subversion Dev mailing list.
