[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svnsync crashes on a huge commit

From: Ben Reser <ben_at_reser.org>
Date: Thu, 15 Aug 2013 20:15:53 -0700

On 7/28/13 1:52 AM, Ben Reser wrote:
> Even if the change that is resolving the problem on trunk isn't viable
> for backport, that still doesn't mean we won't fix the problem till
> 1.9.x, it just means a fix that will work for 1.8.x needs to be
> identified.

I looked into merging the trunk changes. It doesn't look pretty. There
are changes to the update report and inherited properties code that
create conflicts when merging the changes required to get use the
transition based parser code. If Bert wants to nominate the change to
the transition based parser I wouldn't object, but I'm not comfortable
figuring out the backport on this code myself. On the upside all of the
changes should be in private namespace so we shouldn't be prevented from
backporting.

I'm inclined to take a more conservative approach. The following patch
should fix most of this problem on 1.8.x without pulling the new XML
parsing implementation onto 1.8.x.

[[[
Index: subversion/libsvn_ra_serf/replay.c
===================================================================
--- subversion/libsvn_ra_serf/replay.c (revision 1513303)
+++ subversion/libsvn_ra_serf/replay.c (working copy)
@@ -147,10 +147,16 @@
       state == OPEN_FILE || state == ADD_FILE)
     {
       replay_info_t *info;
+ apr_pool_t *pool;

- info = apr_palloc(replay_ctx->dst_rev_pool, sizeof(*info));
+ if (state == OPEN_FILE || state == ADD_FILE)
+ pool = replay_ctx->file_pool;
+ else
+ pool = replay_ctx->dst_rev_pool;

- info->pool = replay_ctx->dst_rev_pool;
+ info = apr_palloc(pool, sizeof(*info));
+
+ info->pool = pool;
       info->parent = parser->state->private;
       info->baton = NULL;
       info->stream = NULL;
]]]

Not sure if this is really complete. I don't think the directory case
or the property case really should be using the dst_rev_pool. In fact
I really don't understand why the replay_ctx->file_pool exists as well
as the info->pool. I think we should probably rip out the
replay_ctx->file_pool, produce appropriate child pools on the info
struct and then use that. But I haven't done that work to see if it
works properly.

I used the following script to reproduce the problem:
[[[
#!/bin/bash

set -e

LOCAL_REPO=file://`pwd`/repo
REMOTE_REPO=https://src.chromium.org/chrome
SVNADMIN=$HOME/builds/svn-1.8.x/subversion/svnadmin/svnadmin
SVNSYNC=$HOME/builds/svn-1.8.x/subversion/svnsync/svnsync
SVNRDUMP=$HOME/builds/svn-1.8.x/subversion/svnrdump/svnrdump

if [ ! -e base.dump -a ! -d base.repo ]; then
  $SVNRDUMP dump -r0:17 "$REMOTE_REPO" > base.dump
fi

rm -rf repo
if [ ! -d base.repo ]; then
  $SVNADMIN create repo
  echo $'#!/bin/sh\nexit 0\n' > repo/hooks/pre-revprop-change
  chmod a+x repo/hooks/pre-revprop-change
  $SVNADMIN load repo < base.dump
  $SVNSYNC init --allow-non-empty "$LOCAL_REPO" "$REMOTE_REPO"
  cp -a repo base.repo
else
  cp -a base.repo repo
fi

$SVNSYNC sync "$LOCAL_REPO"
]]]

I used dump and saved a repo so I could simply repeat r18 and look at
the memory usage. Without the above patch the code used 400-500MB peak
for r18. With the patch it stayed under 50MB, with most of the peak
memory usage coming as the commit is built.

I'll look into further cleanup tomorrow unless Bert really wants to run
with the backport of the transition based code.
Received on 2013-08-16 05:16:29 CEST

This is an archived mail posted to the Subversion Dev mailing list.