On Tue, May 25, 2004 at 11:11:43AM -0500, kfogel@collab.net wrote:
> Greg Hudson <ghudson@MIT.EDU> writes:
> > * If you have many small repositories (e.g. you're a massive hosting
> > site and many of the projects you host never get off the ground),
>
> Ahem, not to mention any names... <cough>, <cough>... :-)
>
> > savings may be better because FSFS has less overhead.
>
> If a repository has 30,000 revisions, how does FSFS do? Is there a
> directory somewhere that has 30,000 entries (in the regular
> filesystem, I mean, not the versioned filesystem, of course)?
>
> Or is there some sort of subdir'ing that changes the potential O(N)
> problem here to O(log(N)) instead?
>
One idea that I was considering (but don't have any operational experience
with) is:
  base/               holds revs 0-999
  base/1000/          holds revs 1000-1999
  base/2000/          holds revs 2000-2999
  ...
  base/1000000/       holds revs 1000000-1000999
  base/1000000/1000/  holds revs 1001000-1001999
  base/1000000/2000/  holds revs 1002000-1002999
  ...
This scheme wastes no directories for the first 1000 revs,
and uses very nearly one directory per thousand revs after
that. Most repositories will never get to 1000000 revs.
The maximum depth would be about log1000(revcount).
No difficult-to-do-in-one's-head hashing is required to
compute the location of a revision.
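
To make the lookup concrete, here is a rough sketch of how a revision
number could be mapped to its directory under this layout. It is only
an illustration: the function name rev_dir, the "base" root and the
fanout parameter are made up for the example, and none of this is
taken from FSFS itself.

```python
def rev_dir(rev, base="base", fanout=1000):
    """Return the directory that would hold revision `rev`.

    Hypothetical helper for the layout sketched above; not FSFS code.
    """
    if rev < 0:
        raise ValueError("revision numbers are non-negative")
    parts = []
    place = fanout        # value of the current base-`fanout` digit position
    rest = rev // fanout  # drop the low digit; it only selects a rev within a dir
    while rest:
        digit = rest % fanout
        if digit:
            # A zero digit adds no directory level, so rev 1000500
            # lives directly in base/1000000/.
            parts.append(str(digit * place))
        rest //= fanout
        place *= fanout
    # Digits were collected least-significant first; paths run the other way.
    return "/".join([base] + parts[::-1]) + "/"
```

For example, rev_dir(1001500) gives "base/1000000/1000/", and anything
below 1000 stays in "base/". Each path component is just one
base-1000 digit of the revision number with its trailing zeros kept,
which is why the location can be read off the number directly.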
--ben
Received on Tue May 25 23:52:31 2004