
Re: Subversion's use of Berkeley DB [#11511]

From: Keith Bostic <bostic_at_abyssinian.sleepycat.com>
Date: 2004-12-08 21:58:17 CET

> From: Garrett Rooney <rooneg@electricjellyfish.net>
> Keith Bostic wrote:
>> 2. Subversion users are occasionally seeing "out of memory"
>> errors. The Subversion code has recently added an error
>> callback routine, so future occurrences of this problem
>> should result in the detailed Berkeley DB error message being
>> available for later debugging.
>> Given the default 256KB cache size, and using, for example,
>> 16KB database page sizes, 8 threads of control in the
>> database at the same time, each grabbing 2 pages, will run
>> the cache out of room, resulting in this failure. So,
>> increasing the cache size may very well fix this problem.
>> Action Items:
>> None at this time.
> So you're saying that the amount of cache used is proportional to the
> number of concurrent threads accessing the db, and if you run out of
> cache things just don't work? That seems less than optimal...
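
For reference, the arithmetic in the quoted item can be written
out as below. This is only a sketch: the function names are made
up for illustration and are not Berkeley DB calls, and it ignores
whatever bookkeeping space the cache region needs for itself.

```c
#include <stddef.h>

/* Number of whole pages that fit in a cache of cache_bytes.
 * (Illustrative helper, not part of the Berkeley DB API.) */
size_t cache_page_count(size_t cache_bytes, size_t page_bytes)
{
    return page_bytes == 0 ? 0 : cache_bytes / page_bytes;
}

/* Returns 1 if the pages held at the same time by all threads
 * leave no free page in the cache, 0 otherwise. */
int cache_exhausted(size_t cache_bytes, size_t page_bytes,
                    size_t threads, size_t pages_per_thread)
{
    size_t held = threads * pages_per_thread;
    return held >= cache_page_count(cache_bytes, page_bytes);
}
```

With a 256KB cache and 16KB pages there is room for 16 pages, and
8 threads holding 2 pages each account for all 16 of them.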

Well, there's a difference between "not working" and "failing".

This error will either force the application to abort a transaction
or cause a non-transactional read to fail and return an error. The
system as a whole will continue to work correctly. (If the
application was doing non-transactional database modifications,
of course, it might not be able to recover, but that's why it's
using transactions in the first place, to handle these kinds of
failure without having to halt the system.)
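
As a rough sketch of what "abort and carry on" can look like in
application code: everything below is hypothetical. The callback
types stand in for the application's own transaction wrappers,
the sketch assumes the failure surfaces as ENOMEM, and whether an
immediate retry is sensible depends on the application.

```c
#include <errno.h>

/* Hypothetical callbacks: a real application would wrap its own
 * transaction body and abort calls behind these. */
typedef int (*txn_body_fn)(void *ctx);   /* returns 0 or an errno value */
typedef void (*txn_abort_fn)(void *ctx);

/* Run body; on failure abort the transaction, and retry only if
 * the failure was ENOMEM. Returns 0 on success, else the last error. */
int run_with_retry(txn_body_fn body, txn_abort_fn abort_txn,
                   void *ctx, int max_attempts)
{
    int ret = EINVAL;   /* returned if max_attempts <= 0 */
    int attempt;

    for (attempt = 0; attempt < max_attempts; attempt++) {
        ret = body(ctx);
        if (ret == 0)
            return 0;
        abort_txn(ctx);         /* discard the partial work */
        if (ret != ENOMEM)
            break;              /* other errors are not retried here */
    }
    return ret;
}

/* Stand-in transaction used only to exercise the loop: it fails
 * with ENOMEM a fixed number of times, then succeeds. */
struct demo_ctx { int failures_left; int aborts; };

int demo_body(void *p)
{
    struct demo_ctx *c = p;
    if (c->failures_left > 0) {
        c->failures_left--;
        return ENOMEM;
    }
    return 0;
}

void demo_abort(void *p)
{
    struct demo_ctx *c = p;
    c->aborts++;
}
```

The point is only that the failure stays local to one transaction:
the caller sees an error code, the partial work is discarded, and
the rest of the system keeps running.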

> Is there
> any way to make it allocate more room to the cache dynamically?

Not really -- here's a longer answer than you probably wanted,
but the argument goes something like this:

In order to support a multi-process application model, Berkeley
DB uses filesystem-backed or system-memory-backed shared memory
segments. The DB cache is one of those segments (as are the DB
locking, logging and transaction subsystems).

The first idea on how to allocate more room to a shared memory
segment is to simply extend the segment. Unfortunately, the OS
doesn't guarantee the ability to extend any specific memory
mapping, and if 2 or 3 processes extend the segment and start
using the extended segment, and 10 minutes later some random
process can't extend the segment, we're going to have to shut
everything down to fix the problem.

The second idea is to allocate a new, larger shared memory segment
and copy the old segment to the new segment. Unfortunately, the
shared memory segments contain mutexes, and mutexes cannot be
copied from one memory location to another on some architectures.
(Even if Berkeley DB could do the copy, it would have to block
all threads of control out of the database environment while it
did the copy, which isn't acceptable for non-stop systems.)

So, to increase the size of a segment, Berkeley DB would have
to allocate a new chunk of shared memory and integrate it into
the running database environment. That's not difficult to do,
except for the data structures. Berkeley DB has lots of linked
lists running through the shared memory (for example, lists of
hash buckets, lockers, page buffers, whatever).

Those linked lists use base address + offset addressing, that
is, the address of an element is the mapped-in address of the
segment plus an offset in the segment. To support multiple
segments in a single linked list, you have to convert from
base + offset addressing to offset + offset addressing, that
is, each pointer to the next element of the list has to contain
a way to find the base address of the segment it's in, plus an
offset in that segment, instead of just an offset to a known
base address. That's going to be slower, and Berkeley DB runs
through those linked lists a lot, usually while holding mutexes.
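
To illustrate the two addressing schemes, here is a minimal sketch.
The struct and function names are invented for the example and are
not Berkeley DB's internal data structures; the second scheme is
modelled as a (segment index, offset) pair looked up in a
per-process table of mapped base addresses.

```c
#include <stddef.h>

#define NIL_OFF ((size_t)-1)    /* marks the end of a list */

/* Scheme 1: one segment. A link is just an offset from the single
 * base address at which this process mapped the segment. */
struct node1 { int value; size_t next_off; };

struct node1 *resolve1(void *base, size_t off)
{
    return off == NIL_OFF ? NULL : (struct node1 *)((char *)base + off);
}

int sum_single(void *base, size_t head_off)
{
    int sum = 0;
    struct node1 *n;
    for (n = resolve1(base, head_off); n != NULL;
         n = resolve1(base, n->next_off))
        sum += n->value;
    return sum;
}

/* Scheme 2: several segments. A link must also say which segment
 * it points into, so every hop needs an extra table lookup. */
struct ref { size_t seg; size_t off; };
struct node2 { int value; struct ref next; };

struct node2 *resolve2(void *const bases[], struct ref r)
{
    return r.off == NIL_OFF
        ? NULL
        : (struct node2 *)((char *)bases[r.seg] + r.off);
}

int sum_multi(void *const bases[], struct ref head)
{
    int sum = 0;
    struct node2 *n;
    for (n = resolve2(bases, head); n != NULL; n = resolve2(bases, n->next))
        sum += n->value;
    return sum;
}

/* Builds a three-element list that hops between two "segments"
 * (plain arrays standing in for shared memory) and sums it. */
int demo_sum_two_segments(void)
{
    static struct node2 seg_a[2], seg_b[1];
    void *bases[2] = { seg_a, seg_b };
    struct ref head = { 0, 0 };

    seg_a[0].value = 1;                       /* a[0] -> b[0] */
    seg_a[0].next.seg = 1;
    seg_a[0].next.off = 0;
    seg_b[0].value = 2;                       /* b[0] -> a[1] */
    seg_b[0].next.seg = 0;
    seg_b[0].next.off = sizeof(struct node2);
    seg_a[1].value = 3;                       /* a[1] ends the list */
    seg_a[1].next.seg = 0;
    seg_a[1].next.off = NIL_OFF;

    return sum_multi(bases, head);
}
```

The extra work per hop in the second scheme is the bases[r.seg]
lookup, paid on every step of every list traversal.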


Keith Bostic bostic@sleepycat.com
Sleepycat Software Inc. keithbosticim (ymsgid)
118 Tower Rd. +1-781-259-3139
Lincoln, MA 01773 http://www.sleepycat.com

Received on Wed Dec 8 21:59:22 2004

This is an archived mail posted to the Subversion Dev mailing list.
