Re: Speeding up cvs2svn (was Re: cvs2svn takes very long time to execute (days!))

From: Tobias Ringström <tobias_at_ringstrom.mine.nu>
Date: 2004-02-15 22:19:24 CET

Roland Dreier wrote:
> Anyway, to summarize my findings on cvs2svn.py performance:
>
> 1. Use tmpfs for a working directory to avoid disk IO.
> 2. Use a LD_PRELOAD-ed library to get rid of useless sleeps in the
> BSD db library.
> 3. Increase the db cache size to avoid shuffling blocks in and out of
> cache.
> 4. It may be worth changing the pass 2 algorithm to increase
> performance.
>
> I'd be very interested to hear any reactions to these ideas.

Interesting, but for 1-3 it would be much faster to avoid the DB library
completely and use in-memory Python dictionaries. One of the things that
takes a long time is the marshalling done when passing data to and from
the DB. I have some code that does this, and I hope to commit it to trunk
within a couple of days.
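
A minimal sketch of the marshalling overhead described above, assuming
a DB-style mapping that serializes every value on each store and load.
It is not the code mentioned in this mail: MarshalledStore, fill and
total are hypothetical names, and the record layout is invented for
illustration. A plain dict stands in for the in-memory approach.

```python
import marshal
import timeit


class MarshalledStore:
    """Stand-in for a DB-backed mapping (hypothetical, for illustration).

    Every value is marshalled to bytes on write and unmarshalled on read,
    which is the per-access cost an on-disk DB imposes.
    """

    def __init__(self):
        self._raw = {}  # bytes key -> marshalled bytes, as a DB file would hold

    def __setitem__(self, key, value):
        self._raw[key.encode()] = marshal.dumps(value)

    def __getitem__(self, key):
        return marshal.loads(self._raw[key.encode()])


def fill(store, n):
    """Store n made-up revision records under string keys."""
    for i in range(n):
        store["rev-%d" % i] = ("file.c,v", i, ["tag-a", "tag-b"])


def total(store, n):
    """Read every record back and sum its second field."""
    return sum(store["rev-%d" % i][1] for i in range(n))


def run(store_factory, n=20000):
    """Fill a fresh store and read it back once."""
    store = store_factory()
    fill(store, n)
    return total(store, n)


if __name__ == "__main__":
    # Timings vary by machine; only the relative difference is of interest.
    for name, factory in (("marshalled", MarshalledStore), ("dict", dict)):
        seconds = timeit.timeit(lambda: run(factory), number=5)
        print("%-10s %.3f s" % (name, seconds))
```

Both stores return identical data; the marshalled one simply pays for a
dumps/loads round trip on every access, and that is before any disk IO
or cache effects that a real DB library would add.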

Please note that cvs2svn is not yet 1.0 material, and it still has
correctness bugs, i.e. the content of tags and branches can be
incorrect. I'm about to commit a new tool that looks for such errors.

/Tobias

Received on Sun Feb 15 22:19:44 2004

This is an archived mail posted to the Subversion Dev mailing list.
