
Re: Random serf checkout failures

From: C. Michael Pilato <cmpilato_at_collab.net>
Date: Mon, 5 Nov 2012 15:39:25 -0500

On 11/05/2012 01:33 PM, C. Michael Pilato wrote:
> On 11/05/2012 11:49 AM, Lieven Govaerts wrote:
>> My current test with svn trunk & apache+mod_dav_svn 1.6.12 does not
>> reproduce this issue.
>>
>> What I am seeing is that svn is using a lot of memory, up to 1GB.
>> Currently looking at the root cause there.
>>
>> First clue: many directories are not closed until the call to
>> close_all_dirs in libsvn_ra_serf/update.c:finish_report. This means
>> that for the most part of the checkout, the associated info object is
>> kept in memory.
>
> My debugging indicates that when close_all_dirs() is called, there are a
> slew of unclosed directories which each have a non-zero ref_count. Save for
> the root directory of the edit (/tags), fetch_props == TRUE,
> propfind_handler->done == TRUE, and tag_closed == TRUE for all open child
> directories. So I'm looking around at how file references are managed to
> see if anything has leaked. I might have inadvertently botched something
> when adding the pull-contents-from-the-wc-cache logic.

So ... if I'm reading the ra_serf code correctly (and let me tell you,
there's a good chance I'm not), the only time we opportunistically close
directory batons is when we're processing the current list of completed file
fetches. See the code which follows this comment:

          /* If we have a valid directory and
           * we have no open items in this dir and
           * we've closed the directory tag (no more children can be added)
           * and either:
           * we know we won't be fetching props or
           * we've already completed the propfind
           * then, we know it's time for us to close this directory.
           */

That code is all encompassed within an outer loop that looks like so:

      /* prune our fetches list if they are done. */
      done_list = report->done_fetches;
      while (done_list)
        {

And all that is nested within yet another loop that runs until the REPORT
response is fully parsed and all auxiliary GETs and PROPFINDs are finished.
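
In skeleton form, the nesting I'm describing looks something like this to
me.  (Simplified types and control flow, not the literal update.c source;
fetch_info_t and report_context_t here are just my shorthand for the real
structures.)

   typedef struct fetch_info_t
   {
     struct fetch_info_t *next;    /* list of completed file GETs */
     /* ... parent dir baton, target stream, etc. ... */
   } fetch_info_t;

   typedef struct report_context_t
   {
     int report_parsed;            /* REPORT response fully parsed? */
     int active_requests;          /* outstanding GETs and PROPFINDs */
     fetch_info_t *done_fetches;   /* finished file-content fetches */
   } report_context_t;

   static void
   drive_report(report_context_t *report)
   {
     /* Outer loop: run until the REPORT response is fully parsed and
        every auxiliary GET and PROPFIND has completed.  (In the real
        code serf's event loop is what advances things here.)  */
     while (!report->report_parsed || report->active_requests > 0)
       {
         fetch_info_t *done_list;

         /* ... let serf process network activity ... */

         /* Prune our fetches list if they are done.  This is the only
            place the opportunistic directory closure happens, so it
            never happens at all if done_fetches stays empty.  */
         done_list = report->done_fetches;
         while (done_list)
           {
             /* ... handle the finished GET; maybe close its parent
                directory (see the next sketch) ... */
             done_list = done_list->next;
           }
         report->done_fetches = NULL;
       }

     /* Only after all of that does finish_report() call close_all_dirs(). */
   }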

So, back to the opportunistic directory closure: *if* we've finished any
GETs since the last time we checked, and if ${STUFF_IN_THAT_BIG_COMMENT},
then we try to close the parent directory of the file we fetched, plus any
other close-able parent directories thereof.
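
Concretely, I believe the per-finished-GET check plus the walk up through
the parents amounts to something like the following.  The fields I test
(ref_count, tag_closed, fetch_props, propfind_handler->done) are the ones
from my earlier mail; the struct and function names are my own shorthand,
not the real update.c declarations:

   typedef struct propfind_t
   {
     int done;                       /* has this PROPFIND completed? */
   } propfind_t;

   typedef struct dir_info_t
   {
     struct dir_info_t *parent_dir;  /* NULL for the root of the edit */
     int ref_count;                  /* still-open children (files + dirs) */
     int tag_closed;                 /* no more children can be added */
     int fetch_props;                /* do we need a PROPFIND at all? */
     propfind_t *propfind_handler;   /* only meaningful if fetch_props */
   } dir_info_t;

   static void
   close_dir(dir_info_t *dir)
   {
     /* ... drive the editor's close_directory(), free the baton ... */
     if (dir->parent_dir)
       dir->parent_dir->ref_count--;
   }

   /* Called once a file's GET has been handled: try to close the file's
      parent directory, then any now-close-able ancestors of it. */
   static void
   maybe_close_dir_chain(dir_info_t *dir)
   {
     while (dir
            && dir->ref_count == 0
            && dir->tag_closed
            && (! dir->fetch_props || dir->propfind_handler->done))
       {
         dir_info_t *parent = dir->parent_dir;

         close_dir(dir);             /* drops parent->ref_count by one */
         dir = parent;
       }
     /* If any of those tests fails, the whole chain stays open -- and
        in memory -- until close_all_dirs() at the end of the report. */
   }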

But what about when there are no file content fetches to manage?

Using a trunk client and 1.7.x server, I just tried to export a tree with
15k directories (10 wide at the top, 15 or so below that, and 100 or so
below those) and no files. Peak memory usage was 153 MB, and not a single
directory was closed until the whole tree had been processed/PROPFINDed/etc.

Unless I'm mistaken, we will delay until the very end the closure of every
directory (and, by refcount-management extension, every parent thereof) for
which we don't execute at least one GET for an immediate child thereof. I
expect this situation is not new to trunk. (In fact, repeating the above
export with a 1.7 client, I see peak memory usage of 458 MB!) Memory
usage improvements in trunk help matters. But then, the feature which
causes ra_serf to use local file contents (avoiding the network where it
can) will, I expect, cause still more instances of directories for which
no immediate file children's contents will be fetched.
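
In other words, for any such directory the only thing that ever closes it
is the sweep Lieven pointed at in finish_report().  Something with roughly
this shape, reusing the dir_info_t / close_dir() shorthand from the sketch
above (my own stand-in, not the real close_all_dirs() signature):

   /* Stand-in for close_all_dirs(): at finish_report() time, whatever
      is still open gets swept up in one pass (deepest directories
      first, so parents close after their children).  Until this runs,
      every deferred dir_info_t -- and the pool behind it -- stays
      live, which is where the memory goes on trees for which few or no
      GETs are ever issued. */
   static void
   close_remaining_dirs(dir_info_t **open_dirs, int n_open)
   {
     int i;

     /* By now the REPORT is fully parsed and every PROPFIND is done,
        so the closure condition finally holds for all of them. */
     for (i = 0; i < n_open; i++)
       close_dir(open_dirs[i]);
   }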

Philip, Lieven, somebody -- care to double-check my thinking here? I'm only
80% confident in my findings at the moment.

-- 
C. Michael Pilato <cmpilato_at_collab.net>
CollabNet   <>   www.collab.net   <>   Enterprise Cloud Development
