
Re: svn_repos_get_logs3 and unbounded memory use

From: Garrett Rooney <rooneg_at_electricjellyfish.net>
Date: 2006-02-09 02:09:15 CET

On 2/8/06, Garrett Rooney <rooneg@electricjellyfish.net> wrote:
> On 2/8/06, C. Michael Pilato <cmpilato@collab.net> wrote:
> > Hrm. So, the additional calls to svn_fs_node_history() simply populate
> > a structure. But you'd have to call svn_fs_history_prev() twice, once
> > to get to the history location you reported in your last iteration, and
> > then again to reach the next new and interesting location. That could
> > really get costly in terms of performance, I imagine. Especially over
> > the likes of 1200 paths. Still, memory is a fixed resource; Time not so
> > much.
>
> The worse part is that because we need to check ALL the histories each
> time through the loop, we are stuck doing that open/prev/prev dance N
> revs x M paths we can't keep open all the time. So as soon as you go
> over this magical limit you'll hit a wall and performance goes to
> hell. Unless I'm missing something here...
>

So, just to see what the impact would be like, I implemented this the
"easy" way, not keeping any history objects open at all, just doing
the open/prev/prev thing each time we need to move back. On my test
case (which runs log on about 1200 paths in a repository that's got
867 revisions in it) this results in about a 20% speed hit, with max
memory usage (via the oh-so-scientific "look at the output of top"
method of measuring) capping out at around 3 megs, as opposed to about
250 megs the old way of doing things.
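
For anyone who wants to see the shape of the open/prev/prev idea without
reading the patch, here is a rough, self-contained sketch. It does not use
the real FS API: toy_history_t, toy_open, toy_prev, toy_path_info_t and
toy_get_history are hypothetical stand-ins I made up for illustration, with
a plain array of revisions playing the role of a node's history. Only the
done and first_time flags are borrowed from the log message; everything
else is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_INVALID_REVNUM (-1L)

/* Hypothetical stand-in for an open history object: a cursor over the
   revisions in which a path changed, listed youngest first. */
typedef struct toy_history_t {
  const long *revs;
  size_t count;
  size_t next;      /* index the next toy_prev() call will return */
} toy_history_t;

/* Hypothetical per-path state. Note that no history object is stored. */
typedef struct toy_path_info_t {
  const long *revs;   /* the path's interesting revisions, youngest first */
  size_t count;
  long history_rev;   /* last location reported for this path */
  int done;           /* nonzero once the history is exhausted */
  int first_time;     /* nonzero until the first location is fetched */
} toy_path_info_t;

/* "Open" a history positioned just before the youngest location that is
   not newer than REV. */
toy_history_t toy_open(const long *revs, size_t count, long rev)
{
  toy_history_t hist = { revs, count, 0 };
  while (hist.next < count && revs[hist.next] > rev)
    hist.next++;
  return hist;
}

/* Step back one location; TOY_INVALID_REVNUM when nothing is left. */
long toy_prev(toy_history_t *hist)
{
  assert(hist != NULL);
  if (hist->next >= hist->count)
    return TOY_INVALID_REVNUM;
  return hist->revs[hist->next++];
}

/* Return the next older location for INFO, reopening the history on every
   call instead of keeping it open between calls. */
long toy_get_history(toy_path_info_t *info, long start)
{
  toy_history_t hist;
  long rev;

  if (info->done)
    return TOY_INVALID_REVNUM;

  /* Open at START the first time, otherwise at the last reported rev. */
  hist = toy_open(info->revs, info->count,
                  info->first_time ? start : info->history_rev);

  /* First prev: lands on the location we opened at. */
  rev = toy_prev(&hist);

  /* Second prev: only needed once that location was already reported. */
  if (!info->first_time && rev != TOY_INVALID_REVNUM)
    rev = toy_prev(&hist);

  info->first_time = 0;
  if (rev == TOY_INVALID_REVNUM)
    info->done = 1;
  else
    info->history_rev = rev;
  return rev;
}
```

Calling toy_get_history repeatedly on a path whose revisions are 860, 500
and 12, starting from 867, hands back each of those in turn and then
reports that the path is done. The point of the layout is that the
per-path state is a couple of integers rather than an open history, which
is where the memory saving comes from; the price is the repeated open and
the extra prev on every step.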

I'm not sure if this is the kind of approach we want to take (it would
be really nice if a call with a more reasonable number of paths could
go faster by keeping the histories open, for example), but I imagine
we're going to need to go down this path at some point unless we want
to put a hard cap on the total number of paths you're allowed to pass
to log.

Thoughts?

-garrett

[[[
Switch svn_repos_get_logs3 to a far less memory intensive implementation.
Instead of keeping each history object open all the time we now open them
as needed, which burns more CPU, but keeps us from using all available
memory.

* subversion/libsvn_repos/log.c
  (path_info): Remove the hist, newpool, and oldpool members, make path
   into a stringbuf and add two booleans, done and first_time.
  (get_history): Take a pool. Open the history object each time through
   and advance it to the proper location.
  (check_history): Take a pool, update for changes to get_history.
  (next_history_rev): Account for new way to tell if we're done.
  (svn_repos_get_logs3): Handle changes to path_info and get_history.
]]]

Received on Thu Feb 9 02:10:21 2006

This is an archived mail posted to the Subversion Dev mailing list.
