Re: FSFS successor ID design draft

From: Hyrum K Wright <hyrum.wright_at_wandisco.com>
Date: Tue, 30 Aug 2011 11:34:34 -0500

I hope I'm not late to the party, but here are a couple of drive-by thoughts.

On Fri, Aug 26, 2011 at 6:14 AM, Stefan Sperling <stsp_at_elego.de> wrote:
> Below is an initial draft of a design for successor-IDs in FSFS.
> It is a bit long, but I hope it covers all relevant points in
> sufficient detail.
>
> Please share your thoughts on this. I have most likely missed a few things.
>
> If we like it I will put it into the repository for further editing,
> either in notes/ on trunk or onto the fsfs-successor ID branch.
>
> If this design has some severe problem I don't mind going back to
> the drawing board. This is my first shot, afterall :)
>
> cmpilato, does this significantly differ from what you've done for BDB?
> Will both backends have the same (or similar) behaviour if we use
> this design for FSFS?
>
> Thanks to neels and danielsh for all your input so far.
> You've been tremendously helpful!
>
> I would also be happy to see alternative proposals.
>
>
>
> FSFS successor IDs
> ==================
>
> See http://svn.apache.org/repos/asf/subversion/branches/fs-successor-ids/BRANCH-README
> for what this is all about.
>
> What follows are ideas about implementing successor ID support for FSFS.
>
> Useful background information:
> Node-revisions: subversion/libsvn_fs_base/notes/fs-history
> subversion/libsvn_fs_base/notes/structure
> fsfs design: subversion/libsvn_fs_fs/structure
>
>
> The "cache" and its properties
> ==============================
>
> Storage of successor IDs is essentially a cache of the result of the
> inversion of the predecessor relation between all node-revisions.
> This inversion is a *very* expensive operation (follow every node
> that ever existed in the repository from its most recent revision
> back to the revision it was created in).
>
> In the following, "cache" refers to the store for successor IDs.
>
> After a general introduction, a proposed cache design is presented.
> Note that the cache implementation will be hidden behind the API of
> libsvn_fs_fs and can be improved or changed in every new minor
> release of Subversion.
>
> Required properties of successor ID cache for FSFS:
> - FSFS-compatible:
> - cache must use some plain file format
> - writers do not block readers (i.e. sqlite is out of the question)
> - cache must not rely on modifying existing revision files
> - cache can be regenerated on demand
> - do not corrupt the cache in case of an unexpected crash
> - try to keep number of directory entries low
> - conserve inodes by design (preferred), or via packing support
>
> Nice-to-have properties:
> - the repository can be live during cache generation
>
>
> Storage size requirements
> =========================
>
> Ballpark figures about node-revs from #svn-dev:
> * cmpilato does some math. There are over 350,000 node revisions in my
> ancient copy of the svn.collab.net Subversion repos.
> <hwright> there are over 20M of them in the ASF repo

Correction: There are roughly 7.2M nodes in the ASF repo (or there
were before the OpenOffice.org import a few days ago).

> How many successors does a given node-rev have?
>
> There is at most one successor on the same line of history (same copy_id).
> Each copy operation also creates one new successor.

I think we need to be bit more clear about when a successor is
actually created in the case of copy. For most copies, particularly
those of the recursive directory kind, a new successor isn't created
until we write to the node. This has some interesting implications.

For instance, folks usually don't modify the contents of a tag, so
there wouldn't be any successor link created for the tag contents.
Even though the tag contents are "logical" successor of their source
files, they aren't actually stored as such. Doing so would make
copying a O(N) operation instead of O(1). (Of course, the O(1)
behavior gives incomplete results when asking "where did this bug move
to?")

Mike may have already worked most of this out on the BDB branch.

> How big will the cache become, in total?
>
> The successor cache does not store entire node-revs, just node-rev IDs.
> These are generally not longer than 40 bytes (less than half a line
> in a 80x25 terminal being the rough estimate).
>
> The amount of raw information to store per node-revision is the
> size of this mapping:
> node_rev_id => successor_id
>
> Each entry is at least 80 bytes large, 40 bytes for the node_rev_id
> and 40 bytes for the successor ID.
>
> Each new node-revision adds one entry to the map.
> Since each successor is itself a node-revision, the entire cache
> has at most as many entries as the amount of node-revs in the
> repository.
>
> More precisely, the amount of successors listed in the cache is
> equal to the number of node revisions in the repository, minus
> the number of nodes in HEAD (which don't presently have successors)
> and minus the number of nodes which were deleted in history (they
> don't have successors either).
>
> In other words, the amount of successors in the cache is equal to
> the amount of pred: entries in revision files.
>
> Let's try to plug in some numbers from the above svn.collab.net case.
>
> There are 350000 node revs.
> maximum size of successor IDs in cache = 40 bytes * 350000
> maximum size of node-rev IDs in cache = 40 bytes * 350000
> maximum size of cache = 40 bytes * 350000 * 2
>
> This amounts to roughly 28MB worst-case successor cache size for
> the svn.collab.net repository (which itself is about 500MB in size).
>
>
> Implementation proposal
> =======================
>...

In reading through this, as well as the discussion in IRC, I'm once
again wondering why we're bolting this stuff onto the outside of FSFS
rather than rethinking the entire FS problem (along with things like
obliterate and move-to storage and ...). I realize we can't do
*everything*, but these feels strangely like libsvn_wc from 5 or 6
years ago. Is there a compelling reason to reinvent the database?

Anyway, I probably won't be doing the actual work, so you can ignore
my comments if you wish, but I figured I'd be remiss if I didn't throw
them out there. :)

-Hyrum

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/

Received on 2011-08-30 18:35:04 CEST

This message: [ Message body ]
Next message: Paul Burba: "Re: Support for reverse merges?"
Previous message: Julian Foad: "Support for reverse merges?"
In reply to: Stefan Sperling: "FSFS successor ID design draft"
Next in thread: C. Michael Pilato: "Re: FSFS successor ID design draft"
Reply: C. Michael Pilato: "Re: FSFS successor ID design draft"
Reply: Neels J Hofmeyr: "Re: FSFS successor ID design draft"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]