On Wed, 06 Jun 2007, Ben Collins-Sussman wrote:
> We've got some sort of large memleak in 'svn update', a major
> regression between svn 1.4 and svn 1.5.
> Over at Google, we're playing around with Subversion on some very
> large trees. In particular, I've got one project here which is about
> 1GB of source code (2GB when checked out as a working copy), about
> 57,000 files and 2000 subdirs.
> I'm working on Windows XP. I do an 'svn update' on this working copy
> which updates a few hundred files. During this update, my svn 1.5
> client (built this week) grows to a resident size of about 300MB, then
> spikes up to 570MB during the final update tree-walk (where the whole
> tree is marked as being at HEAD).
> When I use a stock svn 1.4.3 client to do the same update, the memory
> usage slowly grows to 96MB, and stays there even during the final
> update tree-walk. Not great, but much more sane. No spiking.
The most likely location for a leak would be somewhere at or under the
svn_client__update_internal() call stack.
svn di \
The main differences are for:
a) Eliding of merge info in the WC
b) Depth argument handling
c) Preserved file extension handling
(b) isn't obviously suspect in this function, but there's potential
for a leak buried in the changes to the reporter code. (c) is fairly
simple, and also seems an unlikely culprit.
WRT (a), I notice two potential scalability issues, the second
somewhat dependent upon the first:
1) children_with_mergeinfo_hash could potentially grow to be large.
2) We iterate over the children with merge info, but don't use a
   sub-pool, so each iteration's allocations stay around until the
   enclosing pool is cleared or destroyed.
I'm attaching a patch for #2, but I don't think we can easily fix #1.
Assuming Google's tree doesn't have much/any merge info, I doubt the
patch will have much impact on the memory footprint, unless the leak
is buried under svn_client__elide_mergeinfo().
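
To illustrate why point 2 matters, here is a minimal sketch of the
iteration sub-pool idea. It is not the attached patch and does not use
the real APR API: toy_pool and both functions are hypothetical names, a
byte counter stands in for an actual pool, and the scratch size per
child is an assumed constant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an APR pool: it only counts bytes, so the
   effect on peak memory is easy to observe. */
typedef struct toy_pool {
    size_t used;  /* bytes currently held by the pool */
    size_t peak;  /* high-water mark of 'used' */
} toy_pool;

/* Record an allocation of n bytes and update the high-water mark. */
void toy_pool_alloc(toy_pool *p, size_t n)
{
    assert(p != NULL);
    p->used += n;
    if (p->used > p->peak)
        p->peak = p->used;
}

/* Release everything in the pool at once, as clearing a pool would. */
void toy_pool_clear(toy_pool *p)
{
    assert(p != NULL);
    p->used = 0;
}

/* No sub-pool: every iteration's scratch data lands in the long-lived
   pool, so usage grows with the number of children. */
size_t peak_without_subpool(size_t n_children, size_t scratch_bytes)
{
    toy_pool pool = { 0, 0 };
    size_t i;
    for (i = 0; i < n_children; i++)
        toy_pool_alloc(&pool, scratch_bytes);
    return pool.peak;
}

/* With a sub-pool: the iteration pool is cleared at the top of each
   pass, so at most one iteration's scratch data is alive at a time. */
size_t peak_with_subpool(size_t n_children, size_t scratch_bytes)
{
    toy_pool iterpool = { 0, 0 };
    size_t i;
    for (i = 0; i < n_children; i++) {
        toy_pool_clear(&iterpool);
        toy_pool_alloc(&iterpool, scratch_bytes);
    }
    return iterpool.peak;
}
```

In the real code the same shape would presumably be built from
svn_pool_create(), svn_pool_clear() and svn_pool_destroy() around the
loop over the children. The benefit scales with the number of children
that carry merge info, which is why it likely won't help much on a tree
with little or none.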
We should probably make the reporter code the next stop
(svn_ra_do_update2 and the call stack underneath it).
Received on Thu Jun 7 02:20:36 2007