Peter Lundblad found a thread-safety problem in the patch below.
(<embarrassment> We did think about thread-safety issues for previous
versions of the patch, versions that used a boolean instead of a
struct pointer.  But when we switched to the struct, those previous
thoughts became irrelevant.  Call it the mental equivalent of a cache
invalidation problem.)
Here's the thread issue:
In the block that starts 'if (APR_STATUS_IS_ENOTIMPL (apr_err))',
assume we assign enotimpl_handle_node = apr_palloc(), and then the
thread switches out.  Now another thread switches in and reaches the
earlier check, 'if (enotimpl_handle_node)'.  The new thread sees that
enotimpl_handle_node is not NULL, and proceeds into the conditional
body.  But the first thread never finished initializing
enotimpl_handle_node->handle, enotimpl_handle_node->next, etc.
Badness results!
Peter's coming up with a patch that uses the non-global caching system
that was already present in the code anyway.  Even without the
thread-safety issue, this would be a better solution.
-Karl
kfogel@collab.net writes:
> Background:
> 
> An attempt to load a 90GB dumpfile (of the ASF repository) into FSFS
> resulted in intolerable linear memory growth.  The problem turned out
> to be in the translation code.  Because APR was compiled without
> APR_HAS_XLATE, apr_xlate_open() kept returning APR_ENOTIMPL, which
> meant that there was no resulting xlate_handle to cache, which meant
> that instead of getting it out of the cache the next time, we'd try to
> generate it again... and again... and again... each attempt using up
> some more memory.  This is in libsvn_subr/utf.c:get_xlate_handle_node().
> 
> Justin Erenkrantz temporarily solved this at the ASF by recompiling
> APR with iconv (only to reveal another, slower memory leak, but that's
> a separate issue not discussed here).  We still needed a fix for
> APR_ENOTIMPL in Subversion, so Justin and Ben Collins-Sussman and I
> discussed a strategy, and came up with this patch.  They've reviewed
> prior versions of the patch, but not seen this final one, so they are
> absolved of responsibility for any brainos here.
> 
> Review welcome!
> 
> Thanks,
> -Karl
> 
> [[[
> Plug a memory leak by avoiding repeated failing attempts at apr_xlate setup.
> 
> Patch by: jerenkrantz
>           kfogel
>           sussman
> Suggested by: jerenkrantz
>               dberlin
>               ghudson
>               brane
>               striker
> 
> * subversion/libsvn_subr/utf.c
>   (enotimpl_handle_node): New file-static variable.
>   (get_xlate_handle_node): Use enotimpl_handle_node to determine
>   whether to even bother calling apr_xlate_open again.  Update and fix
>   doc string, which was wrong before anyway.
> ]]]
> 
> Index: subversion/libsvn_subr/utf.c
> ===================================================================
> --- subversion/libsvn_subr/utf.c	(revision 15373)
> +++ subversion/libsvn_subr/utf.c	(working copy)
> @@ -68,6 +68,17 @@
>     memory leak. */
>  static apr_hash_t *xlate_handle_hash = NULL;
>  
> +/* Non-NULL iff apr_xlate_open() ever returned APR_ENOTIMPL.  Since
> +   there would be no point ever calling apr_xlate_open() again in this
> +   process, we can save ourself some setup work (including memory
> +   allocations) by caching a handle_node to return in that case;
> +   enotimpl_handle_node->handle will be NULL, of course.  Note that
> +   apr_xlate_open() could return APR_ENOTIMPL because APR_HAS_XLATE is
> +   not defined, or for some other reason.  For our purposes it doesn't
> +   matter, we just want to remember that it is not implemented for
> +   this process. */
> +static xlate_handle_node_t *enotimpl_handle_node = NULL;
> +
>  /* Clean up the xlate handle cache. */
>  static apr_status_t
>  xlate_cleanup (void *arg)
> @@ -122,12 +133,13 @@
>      }
>  }
>  
> -/* Return an apr_xlate handle for converting from FROMPAGE to
> -   TOPAGE. Create one if it doesn't exist in USERDATA_KEY. If
> -   unable to find a handle, or unable to create one because
> -   apr_xlate_open returned APR_EINVAL, then set *RET to null and
> -   return SVN_NO_ERROR; if fail for some other reason, return
> -   error. */
> +/* Set *RET to a handle node for converting from FROMPAGE to TOPAGE,
> +   creating the handle node if it doesn't exist in USERDATA_KEY.
> +   If unable to find a handle, or unable to create one because
> +   apr_xlate_open returned APR_EINVAL or APR_ENOTIMPL, then
> +   *RET->handle will be NULL (and in the ENOTIMPL case, allocate *RET
> +   itself in a child of the top-level pool, not in POOL).  If fail for
> +   any other reason, return error. */
>  static svn_error_t *
>  get_xlate_handle_node (xlate_handle_node_t **ret,
>                         const char *topage, const char *frompage,
> @@ -199,6 +211,14 @@
>    if (userdata_key && xlate_handle_hash)
>      pool = apr_hash_pool_get (xlate_handle_hash);
>  
> +  /* If we failed due to ENOTIMPL before, then we'd certainly fail
> +     now, so just return without trying. */
> +  if (enotimpl_handle_node)
> +    {
> +      *ret = enotimpl_handle_node;
> +      return SVN_NO_ERROR;
> +    }
> +  
>    /* Try to create a handle. */
>    *ret = apr_palloc (pool, sizeof(xlate_handle_node_t));
>    apr_err = apr_xlate_open (&(*ret)->handle, topage, frompage, pool);
> @@ -229,6 +249,19 @@
>  
>    if (APR_STATUS_IS_EINVAL (apr_err) || APR_STATUS_IS_ENOTIMPL (apr_err))
>      {
> +      if (APR_STATUS_IS_ENOTIMPL (apr_err))
> +        {
> +          /* Remember for next time that we needn't bother trying.
> +             Use the top-level pool because we have no lifetime
> +             guarantees about our local pool parameter. */
> +          enotimpl_handle_node = apr_palloc (svn_pool_create (NULL),
> +                                             sizeof (*enotimpl_handle_node));
> +          enotimpl_handle_node->handle = NULL;
> +          enotimpl_handle_node->frompage = NULL;
> +          enotimpl_handle_node->topage = NULL;
> +          enotimpl_handle_node->next = NULL;
> +        }
> +
>        (*ret)->handle = NULL;
>        return SVN_NO_ERROR;
>      }
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Wed Jul 20 23:03:49 2005