[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

[PATCH] Fix memory leak in SWIG Python bindings

From: Trent Nelson <trent_at_snakebite.org>
Date: Mon, 23 Jan 2012 15:07:20 -0500

    Well, after a week of blood, sweat and tears, I finally tracked down the
    memory leak that's been slowly chipping away at my sanity for the past
    year.

    It's a pretty severe leak that affects the Python SWIG bindings: once
    allocated, an svn_merge_range_t object can never be freed. This is
    due to a missing Py_DECREF call in swigutil_py.c's convert_rangelist
    method.

    The range lists can be huge, too, which makes this quite a nasty leak.
    I was seeing up to 750KB of RSS being eaten up on each invocation of
    svn_mergeinfo_parse() against non-trivial, real-world svn:mergeinfo
    data (such as that on /subversion/trunk, for example).

    Any of the mergeinfo methods that return a range list (which almost all
    do) are affected. I haven't looked at the Perl or Ruby bindings in any
    capacity.

    Patch against trunk attached.

    Regards,

        Trent.

[[

Fix a memory leak in convert_rangelist by *always* Py_DECREF'ing the object
returned by convert_to_swigtype (a SWIG wrapper around svn_merge_range_t)
once we've established it's not NULL.

This is required because PyList_Append takes ownership of references passed
in (i.e. calls Py_INCREF). Prior to this fix, the reference counts of any
svn_merge_range_t objects would always be off-by-one, preventing them from
ever being garbage collected, having dire effects on the memory usage of
long-running programs calling svn_mergeinfo_parse() on real-world data.

Patch by: Trent Nelson <trent_at_snakebite.org>
Tested on: FreeBSD, OS X, Windows.

* subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c:
  (convert_rangelist): Make sure we call Py_DECREF on the object returned
   from convert_to_swigtype. PyList_New might return NULL; check for that.
   Add 'apr_array_header_t *' and 'swig_type_info *' casts where necessary
   to suppress compiler warnings.

* subversion/bindings/swig/python/tests/mergeinfo.py:
  (get_svn_merge_range_t_objects): New helper method that returns a list of
   any 'svn_merge_range_t' objects being tracked by the garbage collector,
   which we use to detect memory leaks.
  (SubversionMergeinfoTestCase): Add two new tests to aid in detecting memory
   leaks. They essentially test the same thing in two different ways; if one
   fails, the other *should* fail. (Of course, if one fails and the other does
   not -- that's just as valuable, diagnostically, and points to an issue with
   the 'automatic pool memory management' logic.)
    (test_mergeinfo_leakage__incorrect_range_t_refcounts): New test. Verify
     svn_merge_range_t objects returned from svn_mergeinfo_parse() have the
     correct refcounts set.
    (test_mergeinfo_leakage__lingering_range_t_objects_after_del): New test.
     Verify that there are no lingering svn_merge_range_t objects after we
     explicitly delete the results returned from svn_mergeinfo_parse().

]]

[[
Index: subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c
===================================================================
--- subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c (revision 1234786)
+++ subversion/bindings/swig/python/libsvn_swig_py/swigutil_py.c (working copy)
@@ -653,15 +653,27 @@
 static PyObject *convert_rangelist(void *value, void *ctx, PyObject *py_pool)
 {
   int i;
+ int result;
+ PyObject *obj;
   PyObject *list;
- apr_array_header_t *array = value;
+ apr_array_header_t *array = (apr_array_header_t *)value;
 
   list = PyList_New(0);
+ if (list == NULL)
+ return NULL;
+
   for (i = 0; i < array->nelts; i++)
     {
       svn_merge_range_t *range = APR_ARRAY_IDX(array, i, svn_merge_range_t *);
- if (PyList_Append(list, convert_to_swigtype(range, ctx, py_pool)) == -1)
+
+ obj = convert_to_swigtype(range, (swig_type_info *)ctx, py_pool);
+ if (obj == NULL)
         goto error;
+
+ result = PyList_Append(list, obj);
+ Py_DECREF(obj);
+ if (result == -1)
+ goto error;
     }
   return list;
  error:
Index: subversion/bindings/swig/python/tests/mergeinfo.py
===================================================================
--- subversion/bindings/swig/python/tests/mergeinfo.py (revision 1234786)
+++ subversion/bindings/swig/python/tests/mergeinfo.py (working copy)
@@ -18,7 +18,7 @@
 # under the License.
 #
 #
-import unittest, os
+import unittest, os, sys, gc, itertools
 from svn import core, repos, fs
 import utils
 
@@ -29,6 +29,15 @@
     self.start = start
     self.end = end
 
+def get_svn_merge_range_t_objects():
+ """Returns a list 'svn_merge_range_t' objects being tracked by the
+ garbage collector, used for detecting memory leaks."""
+ return [
+ o for o in gc.get_objects()
+ if hasattr(o, '__class__') and
+ o.__class__.__name__ == 'svn_merge_range_t'
+ ]
+
 class SubversionMergeinfoTestCase(unittest.TestCase):
   """Test cases for mergeinfo"""
 
@@ -116,6 +125,54 @@
       }
     self.compare_mergeinfo_catalogs(mergeinfo, expected_mergeinfo)
 
+ def test_mergeinfo_leakage__incorrect_range_t_refcounts(self):
+ """Ensure that the ref counts on svn_merge_range_t objects returned by
+ svn_mergeinfo_parse() are correct."""
+ # When reference counting is working properly, each svn_merge_range_t in
+ # the returned mergeinfo will have a ref count of 1...
+ mergeinfo = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
+ for (path, rangelist) in mergeinfo.items():
+ # ....and now 2 (incref during iteration of rangelist)
+
+ for (r, i) in zip(rangelist, itertools.count()):
+ # ....and now 3 (incref during iteration of each range object)
+
+ refcount = sys.getrefcount(r)
+ # ....and finally, 4 (getrefcount() also increfs)
+ expected = 4
+
+ # Note: if path and index are not '/trunk' and 0 respectively, then
+ # only some of the range objects are leaking, which is, as far as
+ # leaks go, even more impressive.
+ self.assertEquals(refcount, expected, (
+ "Memory leak! Expected a ref count of %d for svn_merge_range_t "
+ "object, but got %d instead (path: %s, index: %d). Probable "
+ "cause: incorrect Py_INCREF/Py_DECREF usage in libsvn_swig_py/"
+ "swigutil_py.c." % (expected, refcount, path, i)))
+
+ del mergeinfo
+ gc.collect()
+
+ def test_mergeinfo_leakage__lingering_range_t_objects_after_del(self):
+ """Ensure that there are no svn_merge_range_t objects being tracked by
+ the garbage collector after we explicitly `del` the results returned
+ by svn_mergeinfo_parse(). We disable gc before the svn_mergeinfo_parse
+ call and force an explicit collection cycle straight after the `del`;
+ if our reference counts are correct, the allocated svn_merge_range_t
+ objects will be garbage collected and thus, not appear in the list of
+ objects returned by gc.get_objects()."""
+ gc.disable()
+ mergeinfo = core.svn_mergeinfo_parse(self.TEXT_MERGEINFO1)
+ del mergeinfo
+ gc.collect()
+ lingering = get_svn_merge_range_t_objects()
+ self.assertEquals(lingering, list(), (
+ "Memory leak! Found lingering svn_merge_range_t objects left over from "
+ "our call to svn_mergeinfo_parse(), even though we explicitly deleted "
+ "the returned mergeinfo object. Probable cause: incorrect Py_INCREF/"
+ "Py_DECREF usage in libsvn_swig_py/swigutil_py.c. Lingering objects:\n"
+ "%s" % lingering))
+
   def inspect_mergeinfo_dict(self, mergeinfo, merge_source, nbr_rev_ranges):
     rangelist = mergeinfo.get(merge_source)
     self.inspect_rangelist_tuple(rangelist, nbr_rev_ranges)
]]

Received on 2012-01-23 21:08:01 CET

This is an archived mail posted to the Subversion Dev mailing list.