> -----Original Message-----
> From: C. Michael Pilato [mailto:cmpilato_at_collab.net]
> Sent: Thursday, 8 January 2009 17:10
> To: Ben Collins-Sussman
> Cc: dev_at_subversion.tigris.org
> Subject: Re: philosophical questions about mailer.py
> Ben Collins-Sussman wrote:
> > On Thu, Jan 8, 2009 at 9:48 AM, C. Michael Pilato <cmpilato_at_collab.net> wrote:
> >> Here's where I think the benefit to mailer.py will come (and I say this as a
> >> way of saying, "You may be onto something here, so keep
> >> mailer.py doesn't need to present its results in a tree structure. So all
> >> the work that _replay() does to take a flat list of paths and regenerate the
> >> tree structure is almost immediately discarded by the consumer. That's a
> >> lot of wasted processing.
> > We're seeing some insane numbers here. This command takes only 5
> > seconds to print out all 850 changed-paths:
> > $ svn log -v http://phpcommunityorg.googlecode.com/svn -r31
> > And yet when we run mailer.py on this revision, it takes 68 seconds to
> > finish, and *60 seconds* of that is just in the replay() routine.
> > Something is very wrong. We'll have to investigate a bit more. :-)
> Oh, something else I thought of: the translation between C and Python has,
> in my experience, proven to be very costly. And svn_repos_replay() with
> a ChangeCollector in this situation crosses that translation boundary at
> least 850 times. :-)
For every changed file, the editor drive makes an open_file or add_file
call, at least one action (property change or content change), and then a
close_file. On top of that come some extra calls for the directory levels.
So I would guess at least 2500 translations between C and Python for this
specific example.
(I just wrapped the delta editor in SharpSvn)
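
To make that estimate concrete, here is a rough sketch in plain Python. It
does not use the real Subversion bindings. The function name
count_editor_calls and the sample paths are made up for illustration, and it
assumes every changed path is an existing file with exactly one content
change, so treat the result as a lower bound, not a measurement.

```python
# Hypothetical sketch: tallies delta-editor style callbacks for a flat list
# of changed file paths. It does NOT call the real Subversion bindings.
from collections import Counter
import posixpath


def count_editor_calls(changed_files):
    """Return a Counter of callback names for a minimal editor drive.

    Assumption: every path is a modified file with one content change.
    """
    calls = Counter()
    calls["open_root"] += 1

    # Every ancestor directory is opened once and closed once.
    directories = set()
    for path in changed_files:
        parent = posixpath.dirname(path)
        while parent and parent != "/":
            directories.add(parent)
            parent = posixpath.dirname(parent)
    calls["open_directory"] += len(directories)
    calls["close_directory"] += len(directories) + 1  # plus the root

    # Each file costs three callbacks: open, one change, close.
    for _ in changed_files:
        calls["open_file"] += 1
        calls["apply_textdelta"] += 1
        calls["close_file"] += 1
    return calls


# Made-up layout: 850 files spread over ten directories under trunk/.
paths = ["trunk/dir%d/file%d.php" % (i % 10, i) for i in range(850)]
calls = count_editor_calls(paths)
print(sum(calls.values()))
```

With this made-up layout the file callbacks alone account for 3 * 850 = 2550
crossings, and the directory calls add a few more. Added files or extra
property changes would only push the number higher.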
Received on 2009-01-08 18:59:18 CET