[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: general pool problem

From: Greg Stein <gstein_at_lyra.org>
Date: 2000-08-17 02:39:49 CEST

On Wed, Aug 16, 2000 at 03:47:00PM -0500, Karl Fogel wrote:
> Folks (Greg especially, since he has lots of APR experience):
>
> I'm noticing a general problem, annoying though not fatal. It seems
> that any function returning (svn_error_t *) must take a pool argument,
> because the svn_error constructor needs a pool to allocate from.

Correct.

There are two possibilities here:

*) the callee uses a pool that they have stashed internally inside one of
   the other arguments which were passed
*) the caller provides the pool to use for the svn_error

Problem: only the caller really knows the proper lifetime of that error, and
therefore which pool to use for its allocation.

Resulting problem: if you pass a sub-pool because you want to blow away some
of the callee's allocations (e.g. before a second iteration across that
callee), AND the callee uses the sub-pool for allocating the error... oops.
You probably have an improper lifetime on the svn_error now.

> This means that, like, virtually everything in Subversion has to take
> a pool argument, even the delta-walker callback functions, for
> example. It's not a huge pain, but it means adding an extra argument
> to many functions.

Yup.

> Is this just the way it has to be? Is this the price of the
> svn_error_t system? If so, do you think we would be better off using
> statically sized errors from the user-reserved range of apr_status_t
> values?
>
> I like the wealth of information preserved by svn_error chains, and
> would hate to give it up, but it is costing us in complexity somewhat.

I like the chains also and would hate to lose them.

In most cases, you want to pass a pool anyhow, because the callee is
typically going to have to alloc something in one way or another. But then
you run into the lifetime issue mentioned above.

Here is where Dan's suggestion comes into play: can we specify a pool to be
used for any error allocations? And my response: how do we communicate that
without using a global variable?

Suggestion:

At svn_initialize() time, it is passed a pool to use for *all* of its
allocations. Sub-pools may be created as necessary, but that is THE parent
pool for all allocations. Accordingly, this is the pool to contain all SVN
error allocations.

Question now: how to find that pool? How to avoid passing *two* pools to
functions (one for its allocs, one for errors) ?

Answer:

*) use apr_set_userdata(top_pool, "svn-error-pool", ap_null_cleanup,
   top_pool) in the svn_initialize() function
*) create svn_create_subpool() which creates the subpool and then calls
   apr_get_userdata() to fetch the error pool, then apr_set_userdata() to
   set it on the new subpool.
*) in the svn_create_error() function (and associated creation funcs) use
   apr_get_userdata() on the passed pool to get the error pool for
   allocating the error structures.

With this scheme, you can pass any pool into the error functions, and they
will make sure to allocate the error with the proper lifetime.

[ if we find a case where we ignore an error chain, then we'll need to
  decide whether the resulting error allocs are unbounded; if so, then we
  may need to create an error sub-pool which would allow a caller to clear
  the error subpool if necessary. ]

On the general problem of passing pools: in some cases, the function call
typically takes an object (i.e. the function is actually a method for that
object). The pool that the object used for itself can usually be stored
within the object, allowing any sub-allocs to have the same lifetime as the
object itself.

Some function semantics allocate and return something. The caller will need
to pass a pool for determining the returned-items' lifetimes.

Some functions need to allocate very temporary storage. Generally, those
temp items are simply allocated from any old handy pool. If the amount of
temporary allocation can be unbounded, then it is usally best to formalize
those allocations and provide a way for the caller to manage that temporary
data. (e.g. pass in a temp pool which the caller can clear; or the function
creates a subpool that it destroys on exit)

I hope that clarifies some of the design points of using pools. Of course,
please feel free (anybody) to ask more questions, raise concerns, etc.

Cheers,
-g

ps. note the userdata concept is a sneaky kind of global. however, its
    association with a pool infers a per-thread nature on it. note that most
    APR objects (files, threads, etc) can also have userdata associated with
    them. typically, that userdata is associated with the pool the object
    was allocated from (rather than an internal, distinct structure) which
    means you could see conflicts between pool-userdata and file-userdata.
    not a problem usually, but something to be aware of. in the future, the
    objects' userdata may be separated.

-- 
Greg Stein, http://www.lyra.org/
Received on Sat Oct 21 14:36:06 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.