[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

[PATCH] RA sessions should always be loaded from the same pool (was Re: Crash of svn-1.2.x in svn_ra_open())

From: Nix <nix_at_esperi.org.uk>
Date: 2005-07-27 18:34:14 CEST

Back on Sun, 24 Jul 2005 I mentioned this apparent bug on the users
list:
> I've seen this as long as I've been using 1.2.x: as you might imagine, a
> crash in this routine is pretty disruptive :( notably, it stops me
> doing remote updates of DAV sources, like, well, subversion's. :)

I've tracked this down, but you might not have read that post (it
garnered no comment whatsoever), so here's some of the same info again,
a little more diagnostics, and a putative fix (I know it fixes the bug,
but not if the bug is fixed the right way).

The immediate symptoms are a crash like this, with multiple versions of
APR and on diverse glibc/linux-2.6 architectures:

> amaterasu 477 /usr/packages/subversion/subversion% gdb --args /usr/bin/.svn ls http://svn.collab.net/repos/svn/tags/1.2.1
[...]
> 0x7024495c in memcmp () from /lib/libc.so.6
> (gdb) bt
> #0 0x7024495c in memcmp () from /lib/libc.so.6
> #1 0x70125968 in find_entry (ht=0x41488, key=0x70615a78, klen=15, val=0x40fa0) at tables/apr_hash.c:260
> #2 0x70125b38 in apr_hash_set (ht=0x41488, key=0x70615a78, klen=-1, val=0x40fa0) at tables/apr_hash.c:338
> #3 0x706102d4 in svn_ra_dav__open (session=0x72ca8, repos_URL=0x72bd8 "http://svn.collab.net/repos/svn/tags/1.2.1", callbacks=0x72c18, callback_baton=0x72c30, config=0x400, pool=0x714b8)
> at subversion/libsvn_ra_dav/session.c:710
> #4 0x7009d620 in svn_ra_open (session_p=0xeff1cdd8, repos_URL=0x72bd8 "http://svn.collab.net/repos/svn/tags/1.2.1", callbacks=0x72c18, callback_baton=0x72c30, config=0x40eb0,
> pool=0x714b8) at subversion/libsvn_ra/ra_loader.c:277
> #5 0x7004a03c in svn_client__open_ra_session (ra_session=0xeff1cdd8, base_url=0x72bd8 "http://svn.collab.net/repos/svn/tags/1.2.1", base_dir=0x0, base_access=0x0, commit_items=0x0,
> use_admin=0, read_only_wc=0, ctx=0x40e88, pool=0x714b8) at subversion/libsvn_client/ra.c:288
> #6 0x7004af10 in svn_client__ra_session_from_path (ra_session_p=0xeff1ceb4, rev_p=0xeff1ceb0, url_p=0xeff1ceac, path_or_url=0x714f0 "http://svn.collab.net/repos/svn/tags/1.2.1",
> peg_revision_p=0xeff1cfa8, revision=0xeff1d1b0, ctx=0x72c08, pool=0x714b8) at subversion/libsvn_client/ra.c:887
> #7 0x70047f98 in svn_client_ls2 (dirents=0xeff1cf50, path_or_url=0x714f0 "http://svn.collab.net/repos/svn/tags/1.2.1", peg_revision=0xeff1cfa8, revision=0xeff1d1b0, recurse=0,
> ctx=0x40e88, pool=0x714b8) at subversion/libsvn_client/ls.c:88
> #8 0x000163fc in svn_cl__ls (os=0x40ab8, baton=0xeff1d0e0, pool=0x40a80) at subversion/clients/cmdline/ls-cmd.c:325
> #9 0x00017d2c in main (argc=93184, argv=0x25000) at subversion/clients/cmdline/main.c:1449

The cause of this is that a key in the auth_baton hash has been
invalidated and points at a no-longer-valid address. How, you might ask,
since all the keys are manifest constants? Alas, that's pretty clear if
you stick a breakpoint on apr_hash_sets of this hash table and have a
look at the memory map each time up until the crash:

705e8000-705fe000 r-xp 00000000 fe:03 409821 /usr/packages.bin/subversion/1.2.1/lib/libsvn_ra_dav-1.so.0.0.0
705fe000-7060e000 ---p 00016000 fe:03 409821 /usr/packages.bin/subversion/1.2.1/lib/libsvn_ra_dav-1.so.0.0.0
7060e000-70610000 rwxp 00016000 fe:03 409821 /usr/packages.bin/subversion/1.2.1/lib/libsvn_ra_dav-1.so.0.0.0

705e8000-705fe000 r-xp 00000000 fe:03 409821 /usr/packages.bin/subversion/1.2.1/lib/libsvn_ra_dav-1.so.0.0.0
705fe000-7060e000 ---p 00016000 fe:03 409821 /usr/packages.bin/subversion/1.2.1/lib/libsvn_ra_dav-1.so.0.0.0
7060e000-70610000 rwxp 00016000 fe:03 409821 /usr/packages.bin/subversion/1.2.1/lib/libsvn_ra_dav-1.so.0.0.0

70600000-70616000 r-xp 00000000 fe:03 409821 /usr/packages.bin/subversion/1.2.1/lib/libsvn_ra_dav-1.so.0.0.0
70616000-70626000 ---p 00016000 fe:03 409821 /usr/packages.bin/subversion/1.2.1/lib/libsvn_ra_dav-1.so.0.0.0
70626000-70628000 rwxp 00016000 fe:03 409821 /usr/packages.bin/subversion/1.2.1/lib/libsvn_ra_dav-1.so.0.0.0

Whoops. That's invalidated all the manifest constants that came from the
old shared library quite efficiently, and the next access to that
auth_baton will crash SVN (and does).

A smoking gun from a run with LD_DEBUG=files set:

      2256: calling init: /usr/lib/libsvn_ra_dav-1.so.0
      2256:
      2256: opening file=/usr/lib/libsvn_ra_dav-1.so.0 [0]; direct_opencount=1
[...]
      2256: calling fini: /usr/lib/libsvn_ra_dav-1.so.0 [0]
[... ra_dav's auth_baton is now filled with dangling pointers, but nothing accesses
 them until ...]
      2256: calling init: /usr/lib/libsvn_ra_dav-1.so.0
      2256:
      2256: opening file=/usr/lib/libsvn_ra_dav-1.so.0 [0]; direct_opencount=1
[crash almost immediately afterwards]

This is happening because svn_client__open_ra_session() initiates an
apr_dso_load() of libsvn_ra_dav.so, and it is called repeatedly with
*different pools*. The net effect is to load and unload it repeatedly,
and POSIX certainly doesn't guarantee that it gets loaded at the same
address each time. (The address-space randomization work in OpenBSD,
Fedora, and recently the mainline kernel tries to *ensure* that it never
gets loaded at the same address more than once, but it was never
certain to work even under other POSIX-conforming OSes.)

It looks like subversion must be especially careful not to call
svn_ra_open() or svn_fs_open() (or anything else which eventually loads
DSOs) more than once if they use different pools. (I know that under
POSIX-like OSes, calling apr_dso_load() more than once is harmless: I'm
not sure about Win32, and the APR docs are useless on this point as on
so many others.)

It would be good if APR supported disabling unloading of DSOs entirely:
SVN doesn't need it, and its effects are only damaging. (Personally, I
don't think dlclose() is *ever* a good idea, because of things like
this.)

So as a possibly more maintainable alternative, the hack below fixes
apr_dso_load() callers to always use tiny dedicated pools which are
never deallocated. It works (`svn ls
http://svn.collab.net/repos/svn/tags/1.2.1' no longer crashes, and `make
check' still passes) but it is possibly not the right thing to do. (I'm
a subversion and APR coding neophyte and so don't have the right
instincts in this area yet.)

2005-07-27 Nix <nix@esperi.org.uk>

        * libsvn_ra/ra_loader.c (load_ra_module): Allocate a never-freed
        pool for DSO loading.
        * libsvn_fs/fs-loader.c (load_module): Likewise.

diff -dpurN subversion-orig/subversion/libsvn_fs/fs-loader.c subversion/subversion/libsvn_fs/fs-loader.c
--- subversion-orig/subversion/libsvn_fs/fs-loader.c 2005-07-02 14:29:01.000000000 +0100
+++ subversion/subversion/libsvn_fs/fs-loader.c 2005-07-27 15:55:11.000000000 +0100
@@ -87,6 +87,12 @@ load_module (fs_init_func_t *initfunc, c
     const char *libname;
     const char *funcname;
     apr_status_t status;
+ static apr_pool_t *dedicated_dso_pool = NULL;
+
+ if (!dedicated_dso_pool)
+ {
+ dedicated_dso_pool = svn_pool_create (NULL);
+ }
 
     libname = apr_psprintf (pool, "libsvn_fs_%s-%d.so.0",
                             name, SVN_VER_MAJOR);
@@ -94,8 +100,8 @@ load_module (fs_init_func_t *initfunc, c
 
     /* Find/load the specified library. If we get an error, assume
        the library doesn't exist. The library will be unloaded when
- pool is destroyed. */
- status = apr_dso_load (&dso, libname, pool);
+ SVN is shut down. */
+ status = apr_dso_load (&dso, libname, dedicated_dso_pool);
     if (status)
       return SVN_NO_ERROR;
 
diff -dpurN subversion-orig/subversion/libsvn_ra/ra_loader.c subversion/subversion/libsvn_ra/ra_loader.c
--- subversion-orig/subversion/libsvn_ra/ra_loader.c 2005-07-02 14:28:56.000000000 +0100
+++ subversion/subversion/libsvn_ra/ra_loader.c 2005-07-27 15:55:23.000000000 +0100
@@ -125,6 +125,12 @@ load_ra_module (svn_ra__init_func_t *fun
     const char *funcname;
     const char *compat_funcname;
     apr_status_t status;
+ static apr_pool_t *dedicated_dso_pool = NULL;
+
+ if (!dedicated_dso_pool)
+ {
+ dedicated_dso_pool = svn_pool_create (NULL);
+ }
 
     libname = apr_psprintf (pool, "libsvn_ra_%s-%d.so.0",
                             ra_name, SVN_VER_MAJOR);
@@ -132,7 +138,7 @@ load_ra_module (svn_ra__init_func_t *fun
     compat_funcname = apr_psprintf (pool, "svn_ra_%s_init", ra_name);
 
     /* find/load the specified library */
- status = apr_dso_load (&dso, libname, pool);
+ status = apr_dso_load (&dso, libname, dedicated_dso_pool);
     if (status)
       {
 #if 0
@@ -145,7 +151,7 @@ load_ra_module (svn_ra__init_func_t *fun
         /* Just ignore the error. Assume the library isn't present */
         return SVN_NO_ERROR;
       }
- /* note: the library will be unloaded at pool cleanup */
+ /* note: the library will be unloaded when SVN is shut down */
 
     /* find the initialization routines */
     if (func)

-- 
`Tor employs several thousand editors who they keep in dank
 subterranean editing facilities not unlike Moria' -- James Nicoll 
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Wed Jul 27 19:12:29 2005

This is an archived mail posted to the Subversion Dev mailing list.