
Re: philosophical questions about mailer.py

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: Thu, 08 Jan 2009 11:09:32 -0500

Ben Collins-Sussman wrote:
> On Thu, Jan 8, 2009 at 9:48 AM, C. Michael Pilato <cmpilato_at_collab.net> wrote:
>
>> Here's where I think the benefit to mailer.py will come (and I say this as a
>> way of saying, "You may be onto something here, so keep investigating"):
>> mailer.py doesn't need to present its results in a tree structure. So all
>> the work that _replay() does to take a flat list of paths and regenerate the
>> tree structure is almost immediately discarded by the consumer. That's a
>> lot of wasted processing.
>
> We're seeing some insane numbers here. This command takes only 5
> seconds to print out all 850 changed-paths:
>
> $ svn log -v http://phpcommunityorg.googlecode.com/svn -r31
>
> And yet when we run mailer.py on this revision, it takes 68 seconds to
> finish, and *60 seconds* of that is just in the replay() routine.
> Something is very wrong. We'll have to investigate a bit more. :-)
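
To make the tree-regeneration point concrete, here is a rough sketch in plain Python. It does not use the Subversion bindings at all; tree_events() and the event names are hypothetical stand-ins for what an editor-style drive has to emit when it starts from a flat list of changed paths. A consumer that only wants the flat list pays for every directory open/close anyway.

```python
def tree_events(paths):
    """Yield editor-style events (hypothetical names) for a flat path list."""
    open_dirs = []  # stack of directory components currently "open"
    # Sort by path components so each directory's entries are contiguous.
    for path in sorted(paths, key=lambda p: p.strip("/").split("/")):
        parts = path.strip("/").split("/")
        dirs = parts[:-1]
        # Length of the directory prefix shared with the previous path.
        common = 0
        while (common < len(open_dirs) and common < len(dirs)
               and open_dirs[common] == dirs[common]):
            common += 1
        # Close directories that this path does not live under.
        while len(open_dirs) > common:
            yield ("close_dir", "/".join(open_dirs))
            open_dirs.pop()
        # Open the directories still needed to reach this path.
        for d in dirs[common:]:
            open_dirs.append(d)
            yield ("open_dir", "/".join(open_dirs))
        yield ("change_file", "/".join(parts))
    # Close whatever is still open at the end of the drive.
    while open_dirs:
        yield ("close_dir", "/".join(open_dirs))
        open_dirs.pop()


flat = ["trunk/a.py", "trunk/lib/b.py", "tags/c.py"]  # made-up paths
events = list(tree_events(flat))
file_events = [e for e in events if e[0] == "change_file"]
print(len(flat), "paths ->", len(events), "events")
```

In this toy case three paths turn into nine events, and six of them exist only to rebuild structure that a flat-list consumer throws away.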

Oh, something else I thought of: the translation between C and Python has,
in my experience, proven to be very costly. And svn_repos_replay() driving
a ChangeCollector in this situation crosses that translation boundary at
least 850 times. :-)
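
A toy illustration of that count, with the caveat that nothing here touches the real bindings: CrossingCounter, drive_per_path() and drive_batched() are made-up names, and the "driver" is ordinary Python standing in for the C side. It only shows how the number of C-to-Python transitions scales with one callback per path versus a single call that hands back the whole list.

```python
class CrossingCounter:
    """Counts calls from the (simulated) C driver into Python code."""

    def __init__(self):
        self.crossings = 0

    def per_path_callback(self, path):
        # One crossing for every changed path.
        self.crossings += 1

    def batch_callback(self, paths):
        # One crossing for the whole list.
        self.crossings += 1
        return len(paths)


def drive_per_path(paths, counter):
    # Stand-in for a replay-style driver: one callback per changed path.
    for p in paths:
        counter.per_path_callback(p)


def drive_batched(paths, counter):
    # Stand-in for an API that returns all changed paths in one call.
    counter.batch_callback(list(paths))


changed = ["file%d.php" % i for i in range(850)]  # 850 made-up paths

per_path = CrossingCounter()
drive_per_path(changed, per_path)

batched = CrossingCounter()
drive_batched(changed, batched)

print(per_path.crossings, batched.crossings)
```

A real editor drive would make more than one callback per path (directory opens and closes, property changes and so on), which is why the figure above is a lower bound.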

-- 
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand
------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=1011901

Received on 2009-01-08 17:10:54 CET

This is an archived mail posted to the Subversion Dev mailing list.
