Re: memory-leak - svn::repos instances are not destroyed properly

From: Stefan Sperling <stsp_at_elego.de>
Date: Fri, 14 Oct 2011 13:17:49 +0200

On Fri, Oct 14, 2011 at 12:33:08AM +0200, Max Voit wrote:
> Hi,
>
> developing an application dealing with many repositories the existence
> of paths within that repositories had to be checked.
> Using something like:
>
> my $repos = SVN::Repos::open($localpath) or die "no such repo";
> my $fs = $repos->fs;
>
> $ispath =
> $repos->fs->revision_root( $fs->youngest_rev)->is_dir($path);
>
> undef($fs);
> undef($repos);
>
> resulted in a persistent memory usage of approx. 300MB for ~ 3000 calls,
> though undefing the references of the objects.
>
> Attached is a small sample script, showing the problem with 1000
> calls - resulting in 200M with nothing but the directory in question in
> the repository. Repository-size seems to matter. (the application uses
> near productive repos)
>
> running with
> libsvn1 1.6.6dfsg-2ubuntu1.3
> libsvn-perl 1.6.6dfsg-2ubuntu1.3
> perl 5.10.1-8ubuntu2.1

The Perl bindings don't abstract away memory pool handling.
If you don't pass a pool argument to $fs->revision_root(),
it allocates from the global pool, which can never be cleared.
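
To illustrate the point about the pool argument, here is a minimal sketch
that passes an explicit pool to each call that accepts one. It assumes the
bindings take an optional trailing pool argument (as the get_file() and
check_path() calls in git-svnimport did) and it creates a throwaway empty
repository as a hypothetical stand-in for your $localpath:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use SVN::Core;
use SVN::Repos;
use SVN::Fs;

# Throwaway empty repository so the sketch runs on its own
# (hypothetical stand-in for $localpath in the quoted script).
my $localpath = tempdir(CLEANUP => 1);
SVN::Repos::create($localpath, undef, undef, undef, undef);

# An explicit pool that we own, instead of the global one.
my $pool = SVN::Pool->new;

# Pass the pool as the trailing argument wherever one is accepted.
my $repos  = SVN::Repos::open($localpath, $pool);
my $fs     = $repos->fs;
my $root   = $fs->revision_root($fs->youngest_rev($pool), $pool);
my $is_dir = $root->is_dir('/', $pool);

print "is_dir: ", ($is_dir ? "yes" : "no"), "\n";

# Drop the SVN objects first, then release the memory they lived in.
undef $root;
undef $fs;
undef $repos;
$pool->clear;
```

Threading a pool through every call gets tedious quickly, which is why
a default subpool is usually the more convenient option.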

You need to use an iteration pool in your script and clear it after
each iteration of the for-loop.
See http://subversion.apache.org/docs/community-guide/conventions.html#apr-pools
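
Applied to a loop like yours, the idea could look roughly like the sketch
below. It is untested against your script; the temporary repository, the
'/' path and the $hits counter are placeholders I made up for illustration.
The pool calls (new_default, new_default_sub, clear) are the ones documented
in the SVN::Core man page:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use SVN::Core;
use SVN::Repos;
use SVN::Fs;

# Placeholder repository and path; substitute your own.
my $localpath = tempdir(CLEANUP => 1);
SVN::Repos::create($localpath, undef, undef, undef, undef);
my $path = '/';

# One explicit root pool, plus a subpool that becomes the default
# pool for every SVN call that is not given a pool argument.
my $root_pool = SVN::Pool->new_default;
my $iterpool  = SVN::Pool::new_default_sub;

my $hits = 0;
for my $i (1 .. 1000) {
    # Mark the previous iteration's allocations for reuse.
    $iterpool->clear;

    my $repos = SVN::Repos::open($localpath);
    my $fs    = $repos->fs;
    $hits++ if $fs->revision_root($fs->youngest_rev)->is_dir($path);
}

print "$hits of 1000 checks found a directory\n";
```

Clearing at the top of the loop means the objects from the previous
iteration have already gone out of scope by the time their memory is
reused.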

A good Perl code example is this commit to git.git. See the change
to the while loop at the bottom, which creates a new default subpool
for functions to use and clears it at the start of each iteration.

commit 5d17d765ca754f3e9d87110394a0a397ea7ac2cf
Author: Stefan Sperling <stsp_at_elego.de>
Date: Mon Sep 24 12:57:40 2007 +0200

    Fix pool handling in git-svnimport to avoid memory leaks.
    
    - Create an explicit one-and-only root pool.
    - Closely follow examples in SVN::Core man page.
      Before calling a subversion function, create a subpool of our
      root pool and make it the new default pool.
    - Create a subpool for looping over svn revisions and clear
      this subpool (i.e. mark it for reuse, don't deallocate it)
      at the start of the loop instead of allocating new memory
      with each iteration.
    
    See http://marc.info/?l=git&m=118554191513822&w=2 for a detailed
    explanation of the issue.
    
    Signed-off-by: Stefan Sperling <stsp_at_elego.de>
    Signed-off-by: Junio C Hamano <gitster_at_pobox.com>

diff --git a/git-svnimport.perl b/git-svnimport.perl
index aa5b3b2..ea8c1b2 100755
--- a/git-svnimport.perl
+++ b/git-svnimport.perl
@@ -54,6 +54,7 @@ my $branch_name = $opt_b || "branches";
 my $project_name = $opt_P || "";
 $project_name = "/" . $project_name if ($project_name);
 my $repack_after = $opt_R || 1000;
+my $root_pool = SVN::Pool->new_default;
 
 @ARGV == 1 or @ARGV == 2 or usage();
 
@@ -132,7 +133,7 @@ sub conn {
         my $auth = SVN::Core::auth_open ([SVN::Client::get_simple_provider,
                           SVN::Client::get_ssl_server_trust_file_provider,
                           SVN::Client::get_username_provider]);
- my $s = SVN::Ra->new(url => $repo, auth => $auth);
+ my $s = SVN::Ra->new(url => $repo, auth => $auth, pool => $root_pool);
         die "SVN connection to $repo: $!\n" unless defined $s;
         $self->{'svn'} = $s;
         $self->{'repo'} = $repo;
@@ -147,11 +148,10 @@ sub file {
 
         print "... $rev $path ...\n" if $opt_v;
         my (undef, $properties);
- my $pool = SVN::Pool->new();
         $path =~ s#^/*##;
+ my $subpool = SVN::Pool::new_default_sub;
         eval { (undef, $properties)
- = $self->{'svn'}->get_file($path,$rev,$fh,$pool); };
- $pool->clear;
+ = $self->{'svn'}->get_file($path,$rev,$fh); };
         if($@) {
                 return undef if $@ =~ /Attempted to get checksum/;
                 die $@;
@@ -185,6 +185,7 @@ sub ignore {
 
         print "... $rev $path ...\n" if $opt_v;
         $path =~ s#^/*##;
+ my $subpool = SVN::Pool::new_default_sub;
         my (undef,undef,$properties)
             = $self->{'svn'}->get_dir($path,$rev,undef);
         if (exists $properties->{'svn:ignore'}) {
@@ -202,6 +203,7 @@ sub ignore {
 sub dir_list {
         my($self,$path,$rev) = @_;
         $path =~ s#^/*##;
+ my $subpool = SVN::Pool::new_default_sub;
         my ($dirents,undef,$properties)
             = $self->{'svn'}->get_dir($path,$rev,undef);
         return $dirents;
@@ -358,10 +360,9 @@ open BRANCHES,">>", "$git_dir/svn2git";
 
 sub node_kind($$) {
         my ($svnpath, $revision) = @_;
- my $pool=SVN::Pool->new;
         $svnpath =~ s#^/*##;
- my $kind = $svn->{'svn'}->check_path($svnpath,$revision,$pool);
- $pool->clear;
+ my $subpool = SVN::Pool::new_default_sub;
+ my $kind = $svn->{'svn'}->check_path($svnpath,$revision);
         return $kind;
 }
 
@@ -889,7 +890,7 @@ sub commit_all {
         # Recursive use of the SVN connection does not work
         local $svn = $svn2;
 
- my ($changed_paths, $revision, $author, $date, $message, $pool) = @_;
+ my ($changed_paths, $revision, $author, $date, $message) = @_;
         my %p;
         while(my($path,$action) = each %$changed_paths) {
                 $p{$path} = [ $action->action,$action->copyfrom_path, $action->copyfrom_rev, $path ];
@@ -925,14 +926,14 @@ print "Processing from $current_rev to $opt_l ...\n" if $opt_v;
 my $from_rev;
 my $to_rev = $current_rev - 1;
 
+my $subpool = SVN::Pool::new_default_sub;
 while ($to_rev < $opt_l) {
+ $subpool->clear;
         $from_rev = $to_rev + 1;
         $to_rev = $from_rev + $repack_after;
         $to_rev = $opt_l if $opt_l < $to_rev;
         print "Fetching from $from_rev to $to_rev ...\n" if $opt_v;
- my $pool=SVN::Pool->new;
- $svn->{'svn'}->get_log("/",$from_rev,$to_rev,0,1,1,\&commit_all,$pool);
- $pool->clear;
+ $svn->{'svn'}->get_log("/",$from_rev,$to_rev,0,1,1,\&commit_all);
         my $pid = fork();
         die "Fork: $!\n" unless defined $pid;
         unless($pid) {
Received on 2011-10-14 13:18:25 CEST

This is an archived mail posted to the Subversion Users mailing list.
