
Re: Speeding up cvs2svn (was Re: cvs2svn takes very long time to execute (days!))

From: Tobias Ringström <tobias_at_ringstrom.mine.nu>
Date: 2004-02-15 23:16:12 CET

kfogel@collab.net wrote:
> Oh-ho! The marshalling is more time-consuming than I thought, then.
> (I hadn't profiled it yet, since correctness issues are a higher
> priority to me right now, but still I admit this is surprising.)

For my test case it took the majority of the time. If you have the
time, please try the new profile-cvs2svn.py script and view the data in
kcachegrind (which btw was very easy to build from source). It's quite
illuminating.
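
For readers without the script at hand, here is a minimal sketch of the
same kind of measurement using the standard-library cProfile module. It
is not the profile-cvs2svn.py script mentioned above; the roundtrip()
workload and the profile() helper are hypothetical stand-ins that only
show how marshalling cost shows up in a profile.

```python
import cProfile
import io
import marshal
import pstats


def roundtrip(n):
    """Stand-in workload: marshal and unmarshal n small tuples."""
    total = 0
    for i in range(n):
        # Each tuple has two elements, so this adds 2 per iteration.
        total += len(marshal.loads(marshal.dumps((i, str(i)))))
    return total


def profile(func, *args):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Show the ten most expensive entries by cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, out.getvalue()
```

Calling profile(roundtrip, 100000) and printing the report lists the
marshal calls next to the function that invokes them.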

> A thought: we could cut down the marshalling quite a bit, by making
> marshal/unmarshal behavior an optional flag to the Database class, and
> passing it as false for those databases which use only Python strings
> as keys/values. The Database could still test keys/values for sanity
> before using them, assuming type tests are still cheap (!).
>
> I feel funny about using in-memory hashes. cvs2svn.py should scale
> well by default. Do you plan to automagically switch to a disk
> database if the hash count exceeds a certain magic number?

I'm thinking of adding an --in-memory option. The question of what
should be the default is interesting. If we can make the DB-version
faster, the question is moot, of course. As it stands right now, I'd
personally prefer to make in-memory the default, and add a --huge-repos
option for slow on-disk processing. I wouldn't dare to do that without
support on the dev list of course.
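
To make the two ideas concrete, here is a rough sketch of a key/value
wrapper with an optional marshalling flag and a choice between an
in-memory dict and an on-disk dbm file. This is not the actual Database
class in cvs2svn.py; the class layout and the use_marshal and path
parameters are assumptions made for illustration only.

```python
import dbm
import marshal


class Database:
    """Hypothetical wrapper, not the real cvs2svn Database class."""

    def __init__(self, path=None, use_marshal=True):
        # path=None keeps everything in a plain dict (the in-memory case);
        # otherwise values go to an on-disk dbm file.
        self.use_marshal = use_marshal
        self.db = {} if path is None else dbm.open(path, "c")

    def __setitem__(self, key, value):
        # Cheap sanity test on the key before touching the store.
        if not isinstance(key, str):
            raise TypeError("keys must be str")
        if self.use_marshal:
            data = marshal.dumps(value)
        else:
            # Raw mode skips marshalling, so only strings are allowed.
            if not isinstance(value, str):
                raise TypeError("raw mode stores only str values")
            data = value.encode("utf-8")
        self.db[key.encode("utf-8")] = data

    def __getitem__(self, key):
        data = self.db[key.encode("utf-8")]
        if self.use_marshal:
            return marshal.loads(data)
        return data.decode("utf-8")

    def close(self):
        # A plain dict has no close(); dbm objects do.
        if hasattr(self.db, "close"):
            self.db.close()
```

With this shape, Database(use_marshal=False) would serve the
string-only tables in memory, while Database("some-file") keeps the
marshalled, on-disk behaviour for large repositories.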

>>Please note that cvs2svn is not yet 1.0 material, and it still has
>>correctness bugs, i.e. the content of tags and branches can be
>>incorrect. I'm about to commit a new tool that looks for such errors.
>
> You, sir, rock my socks.

I try to. :-)

The script is a bit raw at the moment, but I hope to work more on it
soon, and I invite others to do so as well. I added it to the repos so
that the rest of you can use it to hunt for correctness bugs.

/Tobias

Received on Sun Feb 15 23:16:40 2004

This is an archived mail posted to the Subversion Dev mailing list.
