Re: I'll help with db debugging

From: Arved Sandstrom <Arved_37_at_chebucto.ns.ca>
Date: 2001-04-02 02:58:09 CEST

On Friday 05 January 2001 03:54, Eric S. Raymond wrote:
> Chip Salzenberg <chip@valinux.com>:
> > I'll be glad to help with debugging the db crash. I've been wanting
> > to get into subversion for a while now, and there's no place better
> > for a newbie to start than with the ugly grotty stuff that no one
> > wants to do....
>
> Larry McVoy told me this weekend at the Kernel Hackers' Summit that
> Subversion is using db, a non-transparent binary format, to store critical
> state. He also said "There was cheering and shouting in the halls at
> Bitkeeper when we heard that news. Those <deleted>s just added a year
> to their development time."
>
> Larry is right. This choice was a major, major blunder that fills me
> with unease about the future of this project. What little you gain in
> performance you will lose in multiplied difficulties and schedule slips
> because corruption will be so much harder to detect and recover from.
>
> Please reverse this bad choice *now*, before you get nibbled to death and
> bogged down in db-related problems.

I'm just monitoring this list because I picked up on it as a good example of
open-source process; it's giving me useful ideas elsewhere. I am otherwise
uninvolved, other than being keen on seeing Subversion succeed.

Couldn't help throwing in my 1.2 cents Canadian on this one, though. To start
with, exactly what is the magical (transparent non-binary) format that will
make it _easy_ to detect corruption and recover from same? I can throw out a
few guesses, but I won't.

Number two, a team that's prone to writing code that garbages up a DB is
going to be prone to writing code that garbages up a text (non-binary)
format. For all but a very few situations I can tell you what my remedial
action would be in both cases - go to the repository backup, and roll back the
Subversion repository to a known state. Just like with any other source
control system. At some time in the future when I am fortunate enough to do
CM with Subversion I can assure everyone that that backup and recovery
strategy will not change.

I don't think either binary formats or typical DBMSs are that shaky. I also
don't think text formats are that robust, not when they are essentially
read-write data. I've dealt with text data for a long time - my background is
scientific data processing up until '96 or so, and I can assure you that it's
just as easy to corrupt text as it is to corrupt binary. I'll grant that the
garbage that results is human-readable in the one case, and not in the other.

In the final analysis, though, why mention a putative "major" problem without
explicitly mentioning a solution? I'm curious.

Regards,
Arved Sandstrom
Received on Sat Oct 21 14:36:27 2006

This is an archived mail posted to the Subversion Dev mailing list.