Re: Help with very large repositories

From: Ben Collins-Sussman <sussman_at_collab.net>
Date: 2005-08-27 20:46:00 CEST

On Aug 27, 2005, at 1:09 PM, Michael Muller wrote:

>
> Ben Collins-Sussman wrote:
>
>> This is a cvs2svn scalability question. You need to resend this mail
>> to the cvs2svn users@ list, and also say what version of cvs2svn
>> you're using.
>>
>
> No, I'm not concerned about the scalability of cvs2svn. As I said, if I
> produce a dumpfile (something like "cvs2svn --dump-only --dumpfile
> repo.dump /usr/local/cvsroot") it only takes about half an hour. Loading
> it with "svnadmin load" appears to take longer and longer for each
> revision.

Ah, sorry.

Part of the problem is that cvs2svn generates a huge number of
branches and tags -- especially tags. So you've got lots of
revisions that keep creating entries under the /tags directory.

On top of that, there's definitely a db-schema inefficiency in our
BerkeleyDB repository implementation. We store a directory's
children in a single lisp-like s-expression: (child1 child2
child3 ...). When you add a new child to a directory in a commit,
we have to write out the entire expression again. So if a directory
already has N children, adding one more child in a commit takes O(N)
time, and each new entry under /tags costs a bit more than the last.
That would explain the load getting slower with every revision.
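
To make the cost concrete, here is a toy Python model of the behaviour
described above. It is only a sketch, not Subversion's actual storage
code: the function names and the "tag-N" entry names are made up for
illustration, and it counts entries written rather than measuring real
I/O. It assumes one new child per commit and a full rewrite of the
directory listing each time.

```python
def serialize(children):
    """Render a directory's children as one lisp-like expression.

    Hypothetical stand-in for the on-disk format; the real encoding differs.
    """
    return "(" + " ".join(children) + ")"


def add_child(children, name):
    """Add one child, returning the new list and the rewritten listing.

    The whole listing is produced again, so the work is proportional
    to the number of children already present.
    """
    updated = children + [name]
    return updated, serialize(updated)


def total_entries_written(num_commits):
    """Count entries written when each commit adds one child to a directory."""
    children = []
    written = 0
    for i in range(num_commits):
        children, _listing = add_child(children, f"tag-{i}")
        written += len(children)  # the full listing is rewritten every commit
    return written


if __name__ == "__main__":
    for n in (10, 1000):
        print(n, total_entries_written(n))
```

Under these assumptions the total grows as N(N+1)/2, so a directory that
ends up with 1000 entries has had about half a million entries written
along the way, even though any single commit is only O(N).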

The FSFS repository implementation still has a similar O(N) problem,
though not quite as bad. (One of our developers, ghudson, speculates
it's only different from the BerkeleyDB implementation by a constant
factor.)

So, what you're experiencing is a combination of two problems:

1. cvs2svn creating huge directories from your huge codebase and
history,

2. a less-than-ideal db schema, one not best suited for
directories with huge numbers of children.

Received on Sat Aug 27 20:47:25 2005
