
Re: cvs2svn takes very long time to execute (days!)

From: Daniel Berlin <dberlin_at_dberlin.org>
Date: 2004-02-15 06:17:31 CET

>
>> Any hints on what takes that much time? (I profiled the script and
>> most of the time was to execute 'co' and in enroot_names())
>>
>> What can I do to make things smoother?
>
> Rewrite the thing to directly parse RCS files? I'm not entirely sure,
> but that might help...
>

No, it won't, unless you parse them using a python C module.

We used to do it in pure Python, and it was several orders of
magnitude slower, which is why we switched to "co".

There are two reasons why doing it directly in Python isn't faster (and
one why it will never be faster without bypassing some Python niceness).
1. co does diff composition; the old pure Python RCS diff handler
didn't. This may or may not matter; you can probably deal with it in
various ways.

The real killer is
2. Strings in Python are immutable. This means any change requires
copies. Even if you solved the composition problem, just the lines
applying the edits were responsible for 99% of the script's run time
(in profiling) because of string copying.
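
To make "applying the edits" concrete, here is a minimal sketch of
applying an RCS-style delta ("dN M" deletes M lines starting at line N,
"aN M" adds the next M lines after line N) to a list of lines. It is not
the old cvs2svn handler or tparse; the name apply_rcs_delta is
hypothetical, and it assumes the text is already split into lines with
newlines stripped. It builds the result in one pass instead of
re-slicing one big string per edit, but it still copies every surviving
line, so it only illustrates the operation and says nothing about
whether it would beat "co".

```python
def apply_rcs_delta(lines, delta):
    """Apply an RCS-style edit script to a list of lines.

    Hypothetical helper for illustration. Line numbers in the script
    are 1-based and refer to the original text, as in RCS deltas.
    Returns a new list; the input list is not modified.
    """
    out = []
    src = 0                      # index of the next unread original line
    script = delta.splitlines()
    i = 0
    while i < len(script):
        cmd = script[i]
        i += 1
        if not cmd:
            continue             # tolerate a blank line between commands
        op = cmd[0]
        start, count = (int(x) for x in cmd[1:].split())
        if op == "d":
            # Keep everything before line `start`, then skip `count` lines.
            out.extend(lines[src:start - 1])
            src = start - 1 + count
        elif op == "a":
            # Keep everything up to and including line `start`,
            # then take the next `count` script lines as new text.
            out.extend(lines[src:start])
            src = max(src, start)
            out.extend(script[i:i + count])
            i += count
        else:
            raise ValueError("unknown edit command: %r" % cmd)
    out.extend(lines[src:])      # copy whatever is left of the original
    return out


old = ["one", "two", "three", "four"]
new = apply_rcs_delta(old, "d2 1\na2 1\nTWO\na4 1\nfive")
print(new)
```

Every str operation that "changes" text (slicing, concatenation,
replace) allocates a new string, which is why the same loop written
against one large string copies the whole file once per edit.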

I once started the changes to tparse (this is a C++ Python module that
viewcvs uses to parse RCS files) so that it could do the revision
generation.
I never finished, and probably won't.
No work has been done on tparse since then, AFAICT. Latest viewcvs repo
shows:

  tparsemodule.cpp 1.6 23 months lbruand Memory leaks bugfix, by D. Berlin

Received on Sun Feb 15 06:18:04 2004

This is an archived mail posted to the Subversion Users mailing list.
