Matt,
Each revision (I verified that each commit touches only a single file)
generates a file in db/revs containing the delta for the file that was
changed. It also contains data that looks like this:
K 13
filename1.xml
V 20
file zy.0.r7854/1078
K 13
filename2.xml
V 19
file 100.0.r7873/38
K 13
filename3.xml
V 20
file 2la.0.r18919/41
K 13
filename4.xml
V 22
file 1fk.0.r11244/1372
With those 4 lines of data for every file and directory in the
repository (15000+), you can see how we're getting 700k per commit,
despite only committing a 2-3k file delta.
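The size estimate above can be checked with a rough sketch. It assumes the four entries shown are representative and that about 15,000 such entries get rewritten on each commit. The names `SAMPLE`, `parse_entries`, and `ENTRY_COUNT` are made up for illustration and are not part of Subversion.

```python
# Four directory entries copied from the rev file excerpt above.
SAMPLE = (
    "K 13\nfilename1.xml\nV 20\nfile zy.0.r7854/1078\n"
    "K 13\nfilename2.xml\nV 19\nfile 100.0.r7873/38\n"
    "K 13\nfilename3.xml\nV 20\nfile 2la.0.r18919/41\n"
    "K 13\nfilename4.xml\nV 22\nfile 1fk.0.r11244/1372\n"
)

# Assumption: roughly this many entries are written per commit.
ENTRY_COUNT = 15000


def parse_entries(text):
    """Split 'K <len> / key / V <len> / value' records into (key, value) pairs."""
    lines = text.splitlines()
    entries = []
    for i in range(0, len(lines), 4):
        k_hdr, key, v_hdr, value = lines[i:i + 4]
        # Each header states the length of the line that follows it.
        assert k_hdr == f"K {len(key)}" and v_hdr == f"V {len(value)}"
        entries.append((key, value))
    return entries


entries = parse_entries(SAMPLE)
avg_bytes = len(SAMPLE) / len(entries)   # sample is plain ASCII, so chars == bytes
estimate = avg_bytes * ENTRY_COUNT
print(f"{avg_bytes:.2f} bytes/entry, about {estimate:,.0f} bytes per commit")
```

For the sample this works out to about 45 bytes per entry, so 15,000 entries come to roughly 680,000 bytes, which is in line with the ~700k observed.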
I found this doc on the fsfs file formats:
http://svn.collab.net/repos/svn/trunk/subversion/libsvn_fs_fs/structure
________________________________
From: Matt Doran [mailto:matt.doran@papercut.biz]
Sent: Saturday, January 28, 2006 7:22 PM
To: Dan White
Cc: users@subversion.tigris.org
Subject: Re: application ill-suited for svn?
Hi Dan,
Even though Subversion has global revision numbers, it only stores the
diffs of files that have changed in the commit (plus any meta-data that
goes along with the commit ... like commit messages, etc). With FSFS
things are slightly more complex than that: it uses a clever technique
called skip-deltas to optimize access to recent revisions without
needing write permission on previous revisions when committing.
However, there is a slight size penalty with this approach. You can
read about this here:
http://svn.collab.net/repos/svn/trunk/notes/skip-deltas. My
understanding is that because of this, repositories using the BDB
backend will be a little smaller than FSFS, but BDB has some other
trade-offs.
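As a minimal sketch of the skip-delta idea, assuming I have read the notes correctly: a file version with N predecessors is stored as a delta against the version whose predecessor count is N with its lowest set bit cleared. The function names `delta_base` and `delta_chain` are hypothetical and only illustrate the rule; this is not Subversion's code.

```python
def delta_base(count):
    """Predecessor count of the assumed delta base: clear the lowest set bit."""
    return count & (count - 1)


def delta_chain(count):
    """Versions visited when reconstructing the version with `count` predecessors."""
    chain = [count]
    while count > 0:
        count = delta_base(count)
        chain.append(count)
    return chain


# Number of deltas applied equals the number of 1 bits in the count,
# so it grows with log2(N) rather than with N.
for n in (1, 54, 62000):
    print(n, delta_chain(n), "deltas:", len(delta_chain(n)) - 1)
```

Under this rule, rebuilding any version means applying a short chain of deltas instead of one per earlier version, at the cost of each delta being somewhat larger than a delta against the immediate predecessor.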
I'm surprised that each commit is adding 700K. It doesn't sound right,
but an svn dev might be able to add more to the discussion. In my
experience SVN is more space efficient than CVS.
Is there a possibility that you are committing more than you think in
each commit? i.e. if you run "svn log -v" on recent revisions ... are
more changed files listed than you would expect?
Cheers,
Matt
Dan White wrote:
Sorry, I was wrong about the number of files. We actually have about
13k files. Each commit now is adding almost 700k to /db/revs. If we
were versioning only at the file level (which is all we really require),
each commit should only use at most the size of the files being updated,
plus any commit comments. This is one reason I'd consider going back to
cvs for this particular repository. I believe we'd have to do a
significant amount of work to recode our app to interface with cvs.
As far as pruning older revisions goes, that's possible for roughly 25%
of the files in the repository. I rebuilt the repository with the
historical revisions for the other 75%, then imported only the most
recent version of the 25% in a single mass commit. This decreased the
repository size to about 11gb.
-----Original Message-----
From: Kevin Greiner [mailto:greinerk@gmail.com]
Sent: Friday, January 27, 2006 5:28 PM
To: Dan White
Cc: users@subversion.tigris.org
Subject: Re: application ill-suited for svn?
On 1/25/06, Dan White <dwhite@clubmom.com> wrote:
Unfortunately we didn't consider how repository level versioning
(which has no benefit in this application) would inflate the db size.
Some 6000 files, each only a few kb in size, and 62000 revisions later,
our /db/revs dir is about 19gb in size and growing almost 1gb daily.
We never have multiple file commits.
This is the first I've heard about repository level versioning inflating
the db size. Could you elaborate? If I did my math right (19,000,000kb
/ 62,000 revs) you're averaging about 300kb per commit. That does seem
high to me. And if you're growing at 1gb/day that means you're getting
roughly 3,300 commits/day. Does that sound about right?
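The back-of-the-envelope math above can be reproduced in a few lines. This is only a sketch using the figures quoted in the thread, and it assumes decimal units (1gb = 1,000,000kb); the variable names are illustrative.

```python
# Figures quoted in the thread.
revs_dir_kb = 19_000_000      # /db/revs is about 19gb
revisions = 62_000            # revisions so far
daily_growth_kb = 1_000_000   # growing almost 1gb per day

kb_per_commit = revs_dir_kb / revisions
commits_per_day = daily_growth_kb / kb_per_commit

print(f"average size per commit: {kb_per_commit:.0f} kb")
print(f"implied commits per day: {commits_per_day:.0f}")
```

Using the unrounded average gives a slightly lower commit rate than rounding to 300kb first, but both land close to the 3,300 commits/day quoted.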
I'm wondering if you could remove older revisions periodically? The
dump file outputs revisions in date order but I don't know if you
could chop off, say, the first 20,000 revisions without borking the
resulting loaded repo or not. Anyone tried this?
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail:
users-help@subversion.tigris.org
Received on Sun Jan 29 11:10:44 2006