Re: Inexcusable BDB upgrade triple blunder

From: Branko Čibej <brane_at_xbc.nu>
Date: 2006-02-28 06:47:07 CET

kfogel@collab.net wrote:
> Branko Čibej <brane@xbc.nu> writes:
>
>> Manuzhai wrote:
>>
>>>> What makes me angry about this is the same thing that made me angry
>>>> the last time I tripped over a rather similar database-recovery botch.
>>>>
>>> Well, if you want to prevent it from happening another time, you
>>> might just want to hop onto the fsfs bandwagon. It's become the
>>> default for 1.2, and for a reason - seems much more stable, no more
>>> db-recovery.
>>>
>> Hm, or the 1.3.(1?) bandwagon with bdb-4.4 -- no more db-recovery
>> there, either. :)
>>
>
> Speaking of which... where do we stand with 1.3.1 and BDB, Brane? Is
> the r18144-etc entry in STATUS ready for review and voting, or are any
> other changes still coming in?
>
The state of the world is that I'm fighting a cold (or is it avian flu?)
that's got my brain even more stuffed up than usual, and I'd rather not
touch this until I'm reasonably certain that I'm back to my normal IQ
level :( higher than right now, in case there's any doubt, thanks :)

> For what it's worth, I got a call from Keith Bostic the other day
> (which I don't think he'd mind my reporting about in a public forum),
> basically saying that the acquisition by Oracle does not change
> anything -- Sleepycat remains available to help in any way they can
> with the SVN integration of BDB 4.4. It didn't sound like you were
> stuck on anything that needed consultation with Sleepycat, but I
> thought I'd mention it just in case.
>
No, I don't need any help from Sleepycat.

I would, however, appreciate some help from the list.

Here's the problem:

    * We have a global cache of open BDB environment descriptors. This
      cache is of course allocated from a (global) pool, and each
      descriptor gets its own subpool. The cached descriptors are
      reference-counted.
    * Each svn_fs_t owns a handle to one of these descriptors. These
      handles (actually, their allocators and pool cleanups) manage the
      descriptor reference counts.
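
For readers who want the shape of that arrangement in code, here is a minimal sketch. It is not the Subversion implementation: the names env_desc, env_acquire, env_release and env_cache_size are hypothetical, and plain malloc/free stands in for the per-descriptor subpools and pool cleanups described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one cached BDB environment descriptor. */
typedef struct env_desc {
    char path[256];          /* key: path of the environment */
    int refcount;            /* number of handles currently open */
    struct env_desc *next;   /* next entry in the global cache */
} env_desc;

/* The "global cache": a simple linked list for illustration. */
static env_desc *cache_head = NULL;

/* Return the cached descriptor for path, creating it on first use.
   Every successful call adds one reference. */
env_desc *env_acquire(const char *path)
{
    env_desc *d;

    for (d = cache_head; d != NULL; d = d->next) {
        if (strcmp(d->path, path) == 0) {
            d->refcount++;
            return d;
        }
    }

    d = calloc(1, sizeof(*d));
    if (d == NULL)
        return NULL;
    strncpy(d->path, path, sizeof(d->path) - 1);
    d->refcount = 1;
    d->next = cache_head;
    cache_head = d;
    return d;
}

/* Drop one reference. When the last one goes, unlink and free the
   descriptor. Returns the number of references that remain. */
int env_release(env_desc *d)
{
    env_desc **p;

    assert(d != NULL && d->refcount > 0);
    if (--d->refcount > 0)
        return d->refcount;

    for (p = &cache_head; *p != NULL; p = &(*p)->next) {
        if (*p == d) {
            *p = d->next;
            break;
        }
    }
    free(d);
    return 0;
}

/* Count the descriptors currently held in the cache. */
int env_cache_size(void)
{
    int n = 0;
    const env_desc *d;

    for (d = cache_head; d != NULL; d = d->next)
        n++;
    return n;
}
```

In this model, opening the same path twice yields the same descriptor with a count of two, and the descriptor leaves the cache only when the second handle is released. In the real code the release step would run from a pool cleanup rather than from an explicit call.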

So far so good. Everything works beautifully, no memory leaks anywhere.
Sheer paradise ...

... except for one nasty problem. The global pool for the cache can be
created *after* the first svn_fs_t is allocated (from a different pool).
Because we have no requirement about library initialization order in
1.x, we can only create the cache when it's first needed, not, say, just
after apr_initialize.

The consequence is that, when we hit apr_terminate, the cache's pool can
be destroyed before the pool that contains the svn_fs_t that refers to
the cache; that's because APR destroys its (global) pools in LIFO order.
Unfortunately that means that during destruction of the svn_fs_t, we try
to access memory that's already been freed.
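
The ordering problem can be reproduced with a toy model. The sketch below does not use APR at all: toy_pool, toy_pool_create, toy_terminate and run_scenario are hypothetical names, and the only property borrowed from the description above is that top-level pools are torn down in LIFO order. Instead of actually touching freed memory, the cleanup just records that it would have.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_POOLS 8

typedef void (*cleanup_fn)(void *baton);

/* Toy top-level pool: a name, one optional cleanup, and a liveness flag. */
typedef struct toy_pool {
    const char *name;
    cleanup_fn cleanup;
    void *baton;
    int alive;
} toy_pool;

static toy_pool pools[MAX_POOLS];   /* in creation order */
static int pool_count = 0;

toy_pool *toy_pool_create(const char *name, cleanup_fn cleanup, void *baton)
{
    toy_pool *p;

    assert(pool_count < MAX_POOLS);
    p = &pools[pool_count++];
    p->name = name;
    p->cleanup = cleanup;
    p->baton = baton;
    p->alive = 1;
    return p;
}

/* Destroy all pools in LIFO order: the most recently created goes first. */
void toy_terminate(void)
{
    while (pool_count > 0) {
        toy_pool *p = &pools[--pool_count];
        if (p->cleanup != NULL)
            p->cleanup(p->baton);
        p->alive = 0;
    }
}

static toy_pool *cache_pool = NULL;  /* pool that owns the descriptor cache */
static int stale_access = 0;         /* set if the fs cleanup ran too late */

/* Stand-in for the svn_fs_t cleanup. The real one would decrement a
   reference count stored in memory owned by the cache pool; here we
   only check whether that pool still exists. */
static void fs_cleanup(void *baton)
{
    (void)baton;
    if (cache_pool == NULL || !cache_pool->alive)
        stale_access = 1;
}

/* Returns 1 if the fs cleanup would have touched a destroyed cache pool.
   cache_first != 0 models "cache initialized right after startup";
   cache_first == 0 models "cache created lazily, after the first fs". */
int run_scenario(int cache_first)
{
    pool_count = 0;
    cache_pool = NULL;
    stale_access = 0;

    if (cache_first)
        cache_pool = toy_pool_create("cache", NULL, NULL);
    toy_pool_create("fs", fs_cleanup, NULL);
    if (!cache_first)
        cache_pool = toy_pool_create("cache", NULL, NULL);

    toy_terminate();
    return stale_access;
}
```

With the cache pool created first, the fs cleanup runs while the cache is still alive and run_scenario(1) reports no stale access. With the lazy order, the cache pool is the newest and is destroyed first, so run_scenario(0) reports that the fs cleanup ran against a pool that no longer exists.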

I've pretty much convinced myself that this can only happen during
apr_terminate, but that's not much help, because it can still
potentially cause a crash (or worse).

I can't change the pool cleanup order short of a) hacking the pool
structure directly (eek!), or b) requiring that svn_fs_initialize be
called right after apr_initialize (ook ... er, that is, impossible).

I've considered using malloc to allocate the descriptors, but I'd still
need a pool per descriptor for open file handles, utf8->native
translation of database paths, etc.

I'm sure that, with all the brainpower on this list, someone will come
up with a trivially elegant solution to this problem.

So ... any bright ideas?

-- Brane

Received on Tue Feb 28 06:53:13 2006

This is an archived mail posted to the Subversion Dev mailing list.
