
Re: svn_repos_get_logs3 and unbounded memory use

From: Garrett Rooney <rooneg_at_electricjellyfish.net>
Date: 2006-02-08 23:47:24 CET

On 2/8/06, C. Michael Pilato <cmpilato@collab.net> wrote:
> Garrett Rooney wrote:
> > So I'm looking at some svn related httpd crashes we've been seeing on
> > svn.apache.org, and it turns out that the root cause seems to be
> > unbounded memory use in svn_repos_get_logs3. We construct a history
> > object for each path, and walk backwards through them until we've
> > either run out of history or sent "limit" revisions back to the user.
> >
> > Unfortunately, this means we actually keep two pools and a history
> > object per path open, which for large numbers of paths (around 1200 in
> > the crashes I'm seeing) results in a LOT of memory being used (about
> > 512 megs in this case, at which point the process crashes because it
> > hits its per-process memory limit).
>
> Ouch.

Yeah, tell me about it.
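
To make the shape of the problem concrete, here is a rough sketch of the
loop as described above. It is Python rather than C to keep it short, and
it is a toy model, not the actual svn_repos_get_logs3 code: the function
name, the "changes" dict, and the list-based cursors are all made up for
illustration. The point is only that every path gets its own cursor, and
all of them stay alive until the walk finishes.

```python
def get_logs_all_open(changes, limit):
    """Toy model: changes maps path -> revisions in which that path changed.

    Returns (revisions sent newest-first, number of cursors held open).
    """
    # One "history object" per path, all kept open for the whole walk.
    # (Hypothetical stand-in for a history object plus its pools.)
    cursors = {p: sorted(revs, reverse=True) for p, revs in changes.items()}
    pos = {p: 0 for p in cursors}
    open_histories = len(cursors)

    sent = []
    while len(sent) < limit:
        # Check ALL the histories to find the youngest interesting revision.
        live = [cursors[p][pos[p]] for p in cursors if pos[p] < len(cursors[p])]
        if not live:
            break  # ran out of history
        rev = max(live)
        sent.append(rev)
        # Step every history that was sitting on that revision.
        for p in cursors:
            if pos[p] < len(cursors[p]) and cursors[p][pos[p]] == rev:
                pos[p] += 1
    return sent, open_histories


changes = {"a": [1, 4, 7], "b": [2, 4], "c": [7, 9]}
print(get_logs_all_open(changes, 3))  # ([9, 7, 4], 3)
```

The second value in the result grows with the number of paths and never
with "limit", which is the unbounded part.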

> > I'm starting to think we might need some sort of system where we only
> > open a fixed number of history objects (and thus only need a fixed
> > number of scratch pools around for iterating through history) at a
> > time. We'd still need to track some info on a per-path basis (i.e.
> > what location in history we were at when we last had that node's
> > history opened) but it would be far less. More troubling would be the
> > performance hit of recreating the history objects lazily, but it's
> > certainly better to be slow than it is to use unbounded memory.
>
> Hrm. So, the additional calls to svn_fs_node_history() simply populate
> a structure. But you'd have to call svn_fs_history_prev() twice, once
> to get to the history location you reported in your last iteration, and
> then again to reach the next new and interesting location. That could
> really get costly in terms of performance, I imagine. Especially over
> the likes of 1200 paths. Still, memory is a fixed resource; time, not so
> much.

The worst part is that because we need to check ALL the histories each
time through the loop, we are stuck doing that open/prev/prev dance for
N revs x M paths, where M is the number of paths we can't keep open all
the time. So as soon as you go over this magical limit you'll hit a wall
and performance goes to hell. Unless I'm missing something here...
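
Back-of-envelope, that wall looks something like the following. This is
only a counting model in Python, under the assumption that every path we
can't keep open costs one history open plus two prev calls on every trip
through the loop. The function name and the cap of 100 open histories
are hypothetical; 1200 is the path count from the crashes.

```python
def reopen_cost(num_paths, max_open, revs_walked):
    """Count extra (opens, prevs) caused by lazily recreating histories.

    Assumes each path beyond max_open is reopened once per loop iteration
    and needs two prev calls: one to get back to the location last
    reported, one to reach the next interesting location.
    """
    closed = max(0, num_paths - max_open)  # M: paths we can't keep open
    opens = revs_walked * closed           # N x M
    prevs = 2 * revs_walked * closed       # two prev calls per reopen
    return opens, prevs


# At or under the cap there is no extra work at all...
print(reopen_cost(100, 100, 50))    # (0, 0)
# ...but past it the cost jumps with every path and every revision.
print(reopen_cost(1200, 100, 50))   # (55000, 110000)
```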

> > Additionally, it seems like we can get away with not opening a
> > revision root per path, since they are all rooted at the same revision
> > anyway.
>
> Yep. +1 on that part for sure.

I'll clean that up and commit it.

-garrett

Received on Wed Feb 8 23:49:05 2006

This is an archived mail posted to the Subversion Dev mailing list.
