
Re: libsvn_fs_bigtable and general svn problems

From: Blair Zajac <blair_at_orcaware.com>
Date: Wed, 19 Aug 2009 17:25:10 -0700

Ben Collins-Sussman wrote:
> On Mon, Aug 17, 2009 at 5:24 PM, Blair Zajac<blair_at_orcaware.com> wrote:
>> Ben Collins-Sussman wrote:
>>> Hi folks,
>>> As you may have noticed, Google Code's Subversion service is still
>>> pretty slow... still much slower than a stock Subversion/Apache server
>>> running on a single box. It turns out to be tricky to work with
>>> bigtable: you get massive scalability, but in return you have to
>>> convert all of BDB's disk i/o calls into network RPCs. On a single
>>> box, the disk i/o calls get faster over time as the OS eventually ends
>>> up swapping the underlying filesystem into memory. But network RPCs
>>> are slow and stay slow. :-/
>> Ben,
>> Are there any writeups on the specifics of the svn_fs.h to BigTable mapping?
>> How are the paths and node-ids mapped to BigTable's key and columns?
> The original port that fitz and I did was fairly brain-dead: we
> simply forked libsvn_fs_base, and replaced BDB calls with Bigtable
> calls. Instead of BDB managing 10 hashtables on disk, we now had
> Bigtable managing the same 10 "columns" in a single Bigtable.
> It certainly worked, but it was heinously slow. Our whole BDB backend
> assumes that reads & writes are essentially free. Sure enough,
> any reasonable OS will eventually page the BDB files directly into
> memory and then access *is* essentially free. However, by converting
> these BDB calls to Bigtable network RPCs, we experienced a 10x
> slowdown. And nothing ever makes the network RPCs faster over time.
> :-)
> We eventually got the system up to a slow-but-usable speed through the
> judicious use of gigantic LRU caches. That's what you see today.
> Jon's project, however, is building a completely new implementation --
> one with a bigtable schema designed from scratch, designed to make as
> few Bigtable RPCs as possible. I'm not sure it's safe for me to spill
> all the details of that schema to the public just yet; I may need to
> get an official nod from someone first.
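
For readers of the archive, the caching idea Ben describes (an LRU cache placed in front of slow network reads) can be sketched roughly as below. This is only an illustration of the general technique, not the Google Code implementation, whose details are not given in this thread. The names LRUCache and fake_rpc are hypothetical, and fake_rpc merely stands in for a Bigtable read.

```python
from collections import OrderedDict


class LRUCache:
    """Read-through LRU cache in front of a slow key/value fetch."""

    def __init__(self, fetch, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._fetch = fetch            # hypothetical stand-in for a network RPC
        self._capacity = capacity
        self._entries = OrderedDict()  # oldest entry first, newest last
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            # Cache hit: mark the key as most recently used, skip the RPC.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: pay for one round trip, then remember the result.
        self.misses += 1
        value = self._fetch(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return value


# Assumption: fake_rpc simulates a remote read and records each call.
rpc_calls = []


def fake_rpc(key):
    rpc_calls.append(key)
    return "value-for-" + key


cache = LRUCache(fake_rpc, capacity=2)
cache.get("a")
cache.get("b")
cache.get("a")  # served from the cache, no RPC is made
```

Repeated reads of a hot key cost one round trip instead of one per access, which is the effect the OS page cache gave BDB for free on a single box. The cache only helps for keys that are read again before being evicted, so it masks the RPC cost rather than reducing the number of distinct reads a schema requires.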


Thanks for the info. Seeing the schema you and Jon come up with would be very


Received on 2009-08-20 02:25:34 CEST

This is an archived mail posted to the Subversion Dev mailing list.
