Larry Shatzer wrote:
> I am trying to check in a directory that has just under 6,000 files
> (98% are new, with the rest of the 2% with changes). The total size
> of the directory is around 5 megabytes. (lots and lots of small XML
> files).
Yeah, I would expect this commit to take about 48MB of memory,
possibly more.
[The rest of this is going to be more comprehensible to other
developers than to Larry.]
The problem is that a subpool is created for each file during the
first part of the commit, and it isn't destroyed until the "Sending
file contents..." part of the commit. (File data is sent after
directory data so that you can learn of conflicts sooner.) Since
subpools take up a minimum of 8K, you wind up using 8K * 6000 =
roughly 48MB of memory--possibly more, due to various allocation
inefficiencies.
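To make the pattern concrete, here is a minimal sketch of what the
first phase of the drive effectively does with APR pools.  The
function name and shape are hypothetical, not the actual ra_svn code:

  #include <apr_pools.h>

  /* Hypothetical illustration of the problem: one subpool per file
     baton, none destroyed until file contents go out at the end. */
  static void
  open_all_files(apr_pool_t *edit_pool, apr_pool_t **file_pools,
                 int n_files)
  {
    int i;
    for (i = 0; i < n_files; i++)
      apr_pool_create(&file_pools[i], edit_pool);  /* >= 8K apiece */
    /* With ~6000 files, that's at least 48MB of live pool memory
       before a single text delta goes over the wire. */
  }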
Various theories about which subsystem is at fault:
1. The APR pool system, for using such a large minimum allocation
size.
2. The commit design, for not being streamy in this regard.
3. The ra_svn editor driver, for using a separate subpool for each
file.
Looking harder at #3: the commit_utils driver avoids this problem on
the client side by lumping all file-with-text-mod batons into a single
pool, which it can do because it knows exactly how long the files are
going to live. The ra_svn driver doesn't have that knowledge, but
perhaps it could do better by having a single reference-counted pool
for files. During an import (where file data is not held until the
end), the refcount would drop to zero and the pool would be cleared
after each file, but during a commit, all files would live in the same
pool. There's no way to know ahead of time whether a file is going to
have a text mod, though, so it couldn't be quite as efficient as the
client-side editor driver.
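Here is a rough sketch of what that reference-counted file pool might
look like.  All names are assumed, and this glosses over the actual
baton plumbing in ra_svn:

  #include <apr_pools.h>

  typedef struct shared_file_pool_t
  {
    apr_pool_t *pool;   /* one pool shared by all open file batons */
    int refcount;       /* number of file batons currently alive */
  } shared_file_pool_t;

  static apr_pool_t *
  file_pool_acquire(shared_file_pool_t *sfp)
  {
    sfp->refcount++;
    return sfp->pool;
  }

  static void
  file_pool_release(shared_file_pool_t *sfp)
  {
    /* During an import, each file is released as soon as it is
       closed, so the refcount hits zero and the pool is cleared
       after every file.  During a commit, batons are held until
       "Sending file contents...", so the refcount stays positive
       and all files accumulate in the single shared pool. */
    if (--sfp->refcount == 0)
      apr_pool_clear(sfp->pool);
  }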