
Re: BDB vs FSFS - OMG!

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Mon, 7 Jan 2013 06:16:47 +0100

On Mon, Jan 7, 2013 at 3:44 AM, C. Michael Pilato <cmpilato_at_collab.net> wrote:

> On 01/06/2013 05:27 AM, Branko Čibej wrote:
> > As Lieven says -- FSFS has been steadily improving while BDB was
> > standing still these last 6 years. IMO, if there were enough users of
> > the BDB back-end to matter, we'd have been given incentive (through bad
> > language on users@ ...) to do more than just keep the back-end limping
> > along to make the testsuite work.
> I haven't really been much considering this "deprecate BDB" question, but
> there are a couple of bits of misleading information in the above that
> should probably be addressed.
> First, it is true that FSFS has been "steadily improving" over the past 6
> years, but I think we should qualify that a bit for the sake of the casual
> reader. FSFS has only improved in the past 6 years in its ability to do
> what it has always done, just faster and without consuming as much disk
> space. We're not talking about bold new features here. Heck, we're not
> even talking about lackluster little features! I certainly don't mean to
> discount the improvements that have been made, of course -- FSFS is a much
> faster, much more stable backend these days (thanks largely to stefan2's
> work).

My impression from Branko's and other people's posts
is that the key concerns are:

* current maintenance costs / maintainability
* impact on FSv2 implementation efforts

The point about repo size and performance is a mere
incentive to people that would need to migrate away
from BDB once the latter wasn't supported anymore.
Given that e.g. my log -g improvements were above
the FS layer, I've simply been surprised to see such
*massive* differences in speed. And that despite the
fact that FSFS is still super wasteful / inefficient when
it comes to file operations.

> Secondly, and speaking as probably the person most likely to invest any
> energy into the BDB backend now or in the past 6 years (notwithstanding
> Julian's valiant if ill-advised foray into "obliterate"), I want to make
> sure folks understand *why* I haven't (or haven't appeared to). The casual
> reader might get the sense from this thread that development on the BDB
> backend stopped because we were content to let it "limp along". But in fact,
> *my* development on the BDB backend stopped because almost every time I
> wanted to add features there, I realized that those features couldn't easily
> have first-class implementations in the FSFS backend while still honoring
> the basic tenets of the FSFS design. (Forward history searching via
> successor links comes quickly to mind, but I know I ran into this on more
> than just that front.)

I guess many people (me included) were not aware
of that limiting aspect. Well, the logical addressing
in format7 will be a major step towards rewritable
history in FSFS. Combined with the long list of
improvements on my 1.9 todo-list, forward history
should become perfectly feasible.

> Feature parity across the backends is beneficial for
> obvious reasons, but that just means that FSFS's immutable-history design
> tenet has been a hindrance to any meaningful feature-type progress for
> itself *and* for any/all other backends. Of course, there are aspects of
> BDB that make certain features infeasible, such as "obliterate", but from a
> design limitations standpoint, FSFS has always been -- and always will be --
> the bottleneck.

The interesting question here would be how many
of those new features should be delayed to FSv2.

In the past, I had favored the strategy of postponing
any structural improvement to a new FSFS2. Today,
however, it feels perfectly feasible to bring those
structural changes to FSFS as part of an evolutionary
approach. So, many of the building blocks will be
tried and tested by the time FSFS2 is called for.

> Finally, like stefan2 said, I'd hazard a guess that the lion's share of
> remaining BDB backend users are folks whose Subversion repositories are
> hosted elsewhere for them.

I want to stress the point that Subversion has put an
emphasis on backward compatibility in the past.

Because there will always be some abandoned repo
sitting somewhere, we should at least offer an upgrade
path even in the then-latest SVN release. IMHO, the
equivalent to 'svnadmin dump' should suffice.
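
As an illustration only, a dump/load migration from a BDB repository to
a fresh FSFS one could be scripted roughly as below. This is a minimal
sketch, assuming an installed svnadmin that can still read the BDB
repository; the function names and repository paths are hypothetical
and not part of any Subversion API.

```python
import subprocess


def migration_commands(bdb_repo, fsfs_repo):
    """Build the svnadmin command lines for a dump/load migration.

    Both paths are placeholders supplied by the caller.
    """
    create = ["svnadmin", "create", "--fs-type", "fsfs", fsfs_repo]
    dump = ["svnadmin", "dump", "--quiet", bdb_repo]
    load = ["svnadmin", "load", "--quiet", fsfs_repo]
    return create, dump, load


def migrate(bdb_repo, fsfs_repo):
    """Create an empty FSFS repository and pipe the BDB dump into it."""
    create, dump, load = migration_commands(bdb_repo, fsfs_repo)
    subprocess.run(create, check=True)
    # Stream the dump straight into the loader instead of buffering it.
    dumper = subprocess.Popen(dump, stdout=subprocess.PIPE)
    try:
        subprocess.run(load, stdin=dumper.stdout, check=True)
    finally:
        dumper.stdout.close()
    if dumper.wait() != 0:
        raise RuntimeError("svnadmin dump failed")
```

Calling migrate("/srv/svn/old-bdb", "/srv/svn/new-fsfs") would run the
three commands in order. Hook scripts and repository configuration are
not part of the dump stream and would have to be copied separately.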

> The BDB backend (thanks to improvements to the
> Berkeley DB library itself) is much more stable today than it was when we
> first started this project, so it's quite possible that we don't hear noise
> on users@ because a) nobody reports the problems that don't happen, or b)
> they report them to their hosting service instead. That said, I am
> confident that BDB usage is and has been rapidly dwindling. CollabNet still
> has customers -- with ginormous repositories and gazoodles of them -- using
> it, but I've lost track of precisely how widespread that is across our
> customer base.

Do you have any numbers on how big the biggest
repos are?

> Outside of CollabNet, I'd guess most of the smaller projects
> migrated to FSFS long ago ... and most of the larger ones migrated to
> DVCS. :-)

-- Stefan^2.

Received on 2013-01-07 06:17:25 CET

This is an archived mail posted to the Subversion Dev mailing list.
