Roland Dreier wrote:
> Anyway, to summarize my findings on cvs2svn.py performance:
>
> 1. Use tmpfs for a working directory to avoid disk IO.
> 2. Use a LD_PRELOAD-ed library to get rid of useless sleeps in the
> BSD db library.
> 3. Increase the db cache size to avoid shuffling blocks in and out of
> cache.
> 4. It may be worth changing the pass 2 algorithm to increase
> performance.
>
> I'd be very interested to hear any reactions to these ideas.
Interesting, but for 1-3, it would be a lot faster to avoid the DB lib
completely and use in-memory Python dictionaries. One of the things that take
a long time is the marshalling done when passing data to and from the
DB. I have some code to do this, and I hope to commit it to trunk in a
couple of days at the latest.
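
To give a rough feel for the marshalling cost, here is a minimal sketch. It is not the code I intend to commit, and it does not touch the real DB library: MarshalledStore, fill_and_read and timed are hypothetical names, and the record layout is made up. The wrapper only imitates the serialize-on-write, deserialize-on-read step that a DB-backed table needs, so the timing difference against a plain dict is indicative at best.

```python
import marshal
import time


class MarshalledStore:
    """Hypothetical stand-in for a DB-backed table.

    Every value is serialized on write and deserialized on read,
    which is the step a plain in-memory dict avoids.
    """

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        self._data[key] = marshal.dumps(value)

    def __getitem__(self, key):
        return marshal.loads(self._data[key])


def fill_and_read(store, n):
    """Write n made-up records into store, read them back, return a checksum."""
    for i in range(n):
        store[str(i)] = (i, "file%d,v" % i, [i, i + 1])
    total = 0
    for i in range(n):
        total += store[str(i)][0]
    return total


def timed(store, n):
    """Return (checksum, elapsed seconds) for one fill_and_read run."""
    start = time.perf_counter()
    result = fill_and_read(store, n)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 100000
    for name, store in (("marshalled", MarshalledStore()), ("plain dict", {})):
        result, elapsed = timed(store, n)
        print("%-12s checksum=%d  %.3fs" % (name, result, elapsed))
```

Both stores return the same checksum; only the time spent differs. A real DB adds disk and cache traffic on top of this, which the sketch does not model.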
Please note that cvs2svn is not yet 1.0 material, and it still has
correctness bugs, i.e. the content of tags and branches can be
incorrect. I'm about to commit a new tool that looks for such errors.
/Tobias
Received on Sun Feb 15 22:19:44 2004