Ben Collins-Sussman wrote:
> On Thu, Jan 8, 2009 at 9:48 AM, C. Michael Pilato <cmpilato_at_collab.net> wrote:
>
>> Here's where I think the benefit to mailer.py will come (and I say this as a
>> way of saying, "You may be onto something here, so keep investigating"):
>> mailer.py doesn't need to present its results in a tree structure. So all
>> the work that _replay() does to take a flat list of paths and regenerate the
>> tree structure is almost immediately discarded by the consumer. That's a
>> lot of wasted processing.
>
> We're seeing some insane numbers here. This command takes only 5
> seconds to print out all 850 changed-paths:
>
> $ svn log -v http://phpcommunityorg.googlecode.com/svn -r31
>
> And yet when we run mailer.py on this revision, it takes 68 seconds to
> finish, and *60 seconds* of that is just in the replay() routine.
> Something is very wrong. We'll have to investigate a bit more. :-)

Oh, something else I thought of: the translation between C and Python has,
in my experience, proven to be very costly. And svn_repos_replay() driving
a ChangeCollector in this situation crosses that translation boundary at
least 850 times. :-)
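To make the callback-count point concrete, here is a rough pure-Python sketch. It does not touch the real bindings: CountingCollector, drive_per_path and drive_batched are hypothetical names made up for illustration, and the paths are fabricated placeholders. It only counts how often a driver calls into the consumer, once per changed path versus once for the whole flat list. It says nothing about how expensive each real C-to-Python crossing is.

```python
# Hypothetical stand-ins for illustration only; none of these names
# exist in the Subversion Python bindings.


class CountingCollector:
    """Records paths and counts how many times the driver calls in."""

    def __init__(self):
        self.calls = 0
        self.paths = []

    def on_path(self, path):
        # Invoked once per changed path.
        self.calls += 1
        self.paths.append(path)

    def on_batch(self, paths):
        # Invoked once with the whole flat list of changed paths.
        self.calls += 1
        self.paths.extend(paths)


def drive_per_path(changed_paths, collector):
    """One callback per path, like a replay driving an editor."""
    for path in changed_paths:
        collector.on_path(path)


def drive_batched(changed_paths, collector):
    """One callback carrying every path at once."""
    collector.on_batch(list(changed_paths))


# 850 made-up paths, matching the size of the revision discussed above.
changed = ["/trunk/file%d.php" % i for i in range(850)]

per_path = CountingCollector()
drive_per_path(changed, per_path)

batched = CountingCollector()
drive_batched(changed, batched)

print(per_path.calls, batched.calls)
```

Both collectors end up with the same list of paths; only the number of calls differs. If each call stood for a trip across the C/Python boundary, the per-path driver would pay that cost 850 times and the batched one once.
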
--
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet <> www.collab.net <> Distributed Development On Demand
------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1011901
Received on 2009-01-08 17:10:54 CET