Node origins cache rewrite

From: Mark Phippard <markphip_at_gmail.com>
Date: Thu, 24 Jan 2008 21:54:41 -0500

I see David has rewritten this to no longer use SQLite. Yay!

That being said, I do still have some reservations. Keep in mind that
CollabNet uses BDB repositories, so I am just speaking from what we
have heard in the past from users.

How many nodes will a large repository have? We have heard from users
whose working copies have thousands of folders and tens or hundreds of
thousands of files. If this represents their trunk, and they have many
branches with modifications, how many nodes can they expect?

As I said previously, just 100,000 nodes x 4 KB block size is about 400 MB of
disk space used. Don't we think users might complain about the
increase? Even if the repository is already 4 GB, I am sure they
would still notice the increase.
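
To make that arithmetic concrete, here is a rough sketch. It assumes each
node's cache entry is stored as its own small file that occupies at least
one filesystem block. The function name and its defaults are illustrative
only and are not part of Subversion.

```python
import math


def cache_disk_usage(node_count, block_size=4096, entry_size=1):
    """Estimate bytes on disk when every node gets its own file.

    Assumption: a file of entry_size bytes is rounded up to whole
    filesystem blocks, and even a tiny file costs one full block.
    """
    blocks_per_entry = max(1, math.ceil(entry_size / block_size))
    return node_count * blocks_per_entry * block_size


if __name__ == "__main__":
    usage = cache_disk_usage(100_000)
    # Report in both decimal megabytes and binary mebibytes.
    print(f"{usage / 1e6:.1f} MB ({usage / 2**20:.1f} MiB)")
```

With 100,000 nodes and 4096-byte blocks this comes out to roughly 400 MB,
no matter how few bytes each entry actually holds.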

Does the Python script to generate the cache still work? I wonder if
we could modify it or otherwise make it available for people to run on
some repositories to get an idea of the number of nodes in their
repository. It would be interesting to see how many nodes are in the
ASF repository. Perhaps we could run it on some of our large
repositories at CollabNet as well.
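
For a repository where the cache has already been generated, something
like the following could give a quick feel for the numbers. This is only
a sketch: it assumes the cache is laid out as one file per node under a
single directory, and the helper name and the path passed to it are
hypothetical. It relies only on the Python standard library.

```python
import os
import sys


def summarize_cache_dir(path):
    """Return (file_count, apparent_bytes, allocated_bytes) under path."""
    count = apparent = allocated = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            count += 1
            apparent += st.st_size
            if hasattr(st, "st_blocks"):
                # On POSIX systems st_blocks counts 512-byte units.
                allocated += st.st_blocks * 512
            else:
                # Not available on Windows; fall back to apparent size.
                allocated += st.st_size
    return count, apparent, allocated


if __name__ == "__main__" and len(sys.argv) > 1:
    n, size, on_disk = summarize_cache_dir(sys.argv[1])
    print(f"{n} files, {size} bytes of data, {on_disk} bytes allocated")
```

Comparing the data size with the allocated size shows how much of the
space goes to block rounding rather than to the cache contents.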

That being said, I suppose we should only do this if there are a
number of nodes at which point we would want to consider changing

When we came up with this design, how many nodes were we thinking
might typically exist? What is it optimized for?

Lots of questions, sorry. Glad to see the progress being made towards
the 1.5 branch though.

Mark Phippard
Received on 2008-01-25 03:56:17 CET

This is an archived mail posted to the Subversion Dev mailing list.