
Re: big memleak in svn 1.5

From: Daniel Rall <dlr_at_collab.net>
Date: 2007-06-07 02:33:00 CEST

On Wed, 06 Jun 2007, Daniel Rall wrote:

> On Wed, 06 Jun 2007, Ben Collins-Sussman wrote:
>
> > We've got some sort of large memleak in 'svn update', a major
> > regression between svn 1.4 and svn 1.5.
> >
> > Over at Google, we're playing around with Subversion on some very
> > large trees. In particular, I've got one project here which is about
> > 1GB of source code (2GB when checked out as a working copy), about
> > 57,000 files and 2000 subdirs.
> >
> > I'm working on Windows XP. I do an 'svn update' on this working copy
> > which updates a few hundred files. During this update, my svn 1.5
> > client (built this week) grows to a resident size of about 300MB, then
> > spikes up to 570MB during the final update tree-walk (where the whole
> > tree is marked as being at HEAD).
> ...
> > When I use a stock svn 1.4.3 client to do the same update, the memory
> > usage slowly grows to 96MB, and stays there even during the final
> > update tree-walk. Not great, but much more sane. No spiking.
> ...

Ben, are you doing this testing over ra_dav, or ra_local?

> The most likely location for a leak would be somewhere at or under the
> svn_client__update_internal() call stack.
>
> R='http://svn.collab.net/repos/svn'
> svn di \
> $R/branches/1.4.x/subversion/libsvn_client/update.c \
> $R/trunk/subversion/libsvn_client/update.c
>
> The main differences are for:
>
> a) Eliding of merge info in the WC
> b) Depth argument handling
> c) Preserved file extension handling
>
> (b) isn't obviously suspect in this function, but there's potential
> for a leak buried in the changes to the reporter code. (c) is fairly
> simple, and also seems pretty unlikely.
>
> WRT (a), I notice two potential scalability issues, the second
> somewhat dependent upon the first:
>
> 1) children_with_mergeinfo_hash could potentially grow to be large.
> 2) We iterate over children with merge info, but don't use a sub-pool.
>
> I'm attaching a patch for #2, but I don't think we can easily fix #1.
> Assuming Google's tree doesn't have much/any merge info, I doubt this
> will have much impact on the memory footprint, unless the leak is
> buried under svn_client__elide_mergeinfo().

That patch wasn't quite right; I'm attaching a corrected revision.
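
For anyone unfamiliar with the sub-pool idiom behind #2, here is a minimal, self-contained sketch. It is not the attached patch. It uses a toy arena in place of APR pools, and every name in it (toy_pool_t, toy_alloc, toy_clear, process_child) is hypothetical. In real Subversion code the same shape is normally written with svn_pool_create(), svn_pool_clear() inside the loop, and svn_pool_destroy() after it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy arena standing in for an APR pool (NOT the APR API): memory is
   only released when the whole arena is cleared. */
typedef struct block { struct block *next; } block_t;

typedef struct {
    block_t *head;      /* singly linked list of live allocations */
    size_t live_bytes;  /* bytes currently held by the arena */
    size_t peak_bytes;  /* high-water mark of live_bytes */
} toy_pool_t;

static void *toy_alloc(toy_pool_t *p, size_t n)
{
    block_t *b = malloc(sizeof *b + n);
    assert(b != NULL);
    b->next = p->head;
    p->head = b;
    p->live_bytes += n;
    if (p->live_bytes > p->peak_bytes)
        p->peak_bytes = p->live_bytes;
    return b + 1;  /* caller's memory starts just past the header */
}

static void toy_clear(toy_pool_t *p)
{
    while (p->head != NULL) {
        block_t *next = p->head->next;
        free(p->head);
        p->head = next;
    }
    p->live_bytes = 0;
}

/* Hypothetical per-child work that needs 1 KiB of scratch memory. */
static void process_child(toy_pool_t *scratch)
{
    (void)toy_alloc(scratch, 1024);
}

/* No sub-pool: every iteration allocates from the long-lived pool,
   so the footprint grows with the number of children. */
size_t peak_without_subpool(int n_children)
{
    toy_pool_t pool = { NULL, 0, 0 };
    size_t peak;
    int i;
    for (i = 0; i < n_children; i++)
        process_child(&pool);
    peak = pool.peak_bytes;
    toy_clear(&pool);
    return peak;
}

/* With an iteration pool: clear it at the top of each pass, so at most
   one iteration's worth of scratch memory is live at any time. */
size_t peak_with_subpool(int n_children)
{
    toy_pool_t iterpool = { NULL, 0, 0 };
    size_t peak;
    int i;
    for (i = 0; i < n_children; i++) {
        toy_clear(&iterpool);
        process_child(&iterpool);
    }
    peak = iterpool.peak_bytes;
    toy_clear(&iterpool);
    return peak;
}
```

In this toy, the first loop's peak grows linearly with the number of children, while the second stays at one iteration's worth of scratch memory. That is the only effect #2 is after, and it does nothing about the size of the hash itself (#1).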

> We should probably make the reporter code the next stop
> (svn_ra_do_update2 and the call stack underneath it).

The potential problem areas in the client-side part of the reporter's
call stack differ quite a bit between ra_dav and ra_local.

In the ra_dav code, I noticed quite a few other changes (e.g.
cancellation improvements and refactoring).

Received on Thu Jun 7 02:33:10 2007

This is an archived mail posted to the Subversion Dev mailing list.
