
Re: I'll help with db debugging

From: Jim Blandy <jimb_at_zwingli.cygnus.com>
Date: 2001-04-02 06:33:43 CEST

"Eric S. Raymond" <esr@thyrsus.com> writes:
> Chip Salzenberg <chip@valinux.com>:
> > I'll be glad to help with debugging the db crash. I've been wanting
> > to get into subversion for a while now, and there's no place better
> > for a newbie to start than with the ugly grotty stuff that no one
> > wants to do....
> Larry McVoy told me this weekend at the Kernel Hackers' Summit that
> Subversion is using db, a non-transparent binary format, to store critical
> state. He also said "There was cheering and shouting in the halls at
> Bitkeeper when we heard that news. Those <deleted>s just added a year
> to their development time."
> Larry is right. This choice was a major, major blunder that fills me
> with unease about the future of this project. What little you gain in
> performance you will lose in multiplied difficulties and schedule slips
> because corruption will be so much harder to detect and recover from.
> Please reverse this bad choice *now*, before you get nibbled to death and
> bogged down in db-related problems.

Yes, we're using Berkeley DB for the repository storage. However, all
the keys and values are stored in a processor-independent,
human-readable format. So the repository can easily be dumped and
examined in any text editor using the stock Berkeley DB table dump
facility. In fact, Berkeley DB includes a program which acts as the
inverse of the dumping program, and can recreate a database from the
human-readable form.
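
To illustrate how little is needed to work with that dumped form, here is a minimal Python sketch that parses the printable output of the stock dump utility (`db_dump -p`). It assumes the usual layout: `name=value` header lines up to `HEADER=END`, then alternating key and value lines, each indented by one space, up to `DATA=END`. The function name and the sample keys and values are hypothetical, backslash escapes for non-printing bytes are left undecoded, and only a single database per dump is handled.

```python
def parse_printable_dump(text):
    """Parse `db_dump -p` style text into (header, records).

    Assumes keys and values alternate, one per line, each prefixed
    by a single space. Escape sequences are NOT decoded here.
    """
    header = {}
    records = []
    lines = iter(text.splitlines())

    # Header section: name=value pairs, terminated by HEADER=END.
    for line in lines:
        if line == "HEADER=END":
            break
        name, _, value = line.partition("=")
        header[name] = value

    # Data section: key line, then value line, until DATA=END.
    pending_key = None
    for line in lines:
        if line == "DATA=END":
            break
        item = line[1:] if line.startswith(" ") else line
        if pending_key is None:
            pending_key = item
        else:
            records.append((pending_key, item))
            pending_key = None

    return header, records


# Hypothetical sample; real table contents will differ.
SAMPLE = """VERSION=3
format=print
type=btree
HEADER=END
 key-one
 value-one
 key-two
 value-two
DATA=END
"""

header, records = parse_printable_dump(SAMPLE)
```

Because the dump is plain text, the same records can be inspected, diffed, or edited by hand and then fed back through the loading program.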

I think Berkeley DB was a good choice as a back end. It is mature
software; it provides atomic transactions, crash recovery, and hot
backups; and it has a reputation for being efficient.

The biggest problem with Berkeley DB is that the database format has
historically changed frequently. To deal with that, we're planning to
write some very simple table dumping programs, which use no system
calls other than `seek' and `read', and simply keep an archive of
programs that can read the tables produced by every version of
Berkeley DB we've ever used. (Perhaps the Sleepycat people will help
us with this.)
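
As a rough sketch of what such a restricted reader could look like, the Python below touches an already-open file descriptor through nothing but seek and read. The helper names and the idea of reading a 32-bit field at a fixed offset are assumptions for illustration only; this is not the actual Berkeley DB page layout, which a real dumper would have to encode separately for each format version.

```python
import os
import struct


def read_at(fd, offset, length):
    """Return exactly `length` bytes starting at `offset`.

    Uses only lseek and read on the descriptor, so it does not
    depend on any database library being installed.
    """
    os.lseek(fd, offset, os.SEEK_SET)
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:
            raise EOFError("file ended before %d bytes were read" % length)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_u32(fd, offset, byteorder="<"):
    """Read an unsigned 32-bit integer at `offset`.

    `byteorder` is a struct prefix: "<" little-endian, ">" big-endian.
    The field's position is supplied by the caller (hypothetical here).
    """
    return struct.unpack(byteorder + "I", read_at(fd, offset, 4))[0]
```

A per-version dumper would then amount to a table of offsets and field widths on top of these two helpers, which keeps each archived program small enough to audit by eye.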

As has been said before, I'm flattered that Bitkeeper considers us
such a threat. And I'm sorry that you feel we've made a poor choice.
And as always, the code is yours to improve.
Received on Sat Oct 21 14:36:27 2006

This is an archived mail posted to the Subversion Dev mailing list.