
The future of FSFS and FSX

From: Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com>
Date: Tue, 2 Jul 2013 15:52:10 +0200

We talked about this in Berlin, but since then I have kept
pondering the question of where FSFS improvements should go, and
I changed my mind more than once. Now I think I have a consistent
and workable answer.

Historically, the fsfs-format7 branch was destined to 'fix all
that is "wrong"' with FSFS-f6. As I analyzed and addressed
issues, more things kept coming up and I found solutions that
work reasonably well. Some are already implemented, others won't
be for a while.

All that led to a point where FSFS backward compatibility is much
more of a burden and stability risk than a tangible benefit.
The relevant code is still there - despite ripping stuff from the
FSFS and FSX backends. However, at no point did an FSFS-f7 with
"just the right amount of improvement" - as outlined below -
actually exist.

Thus, the following strategy:

* Get the fsfs-f6 compatible refactorings and improvements to /trunk.
  That requires no format bump and the current state of FSFS on
  the fsfs-format7 branch will be the blueprint.

* After that, create a branch for what fsfs-f7 can feasibly be.
  Most of that code can be found right at the point where I
  forked FSFS and FSX. Manually applying changes will be required
  for review anyway:

  - based on fsfs-f6
  - add support for logical addressing
  - data alignment and block read
  - pack() reorders data on disk

  TODO but not very complicated:

  - support of mixed addressing repositories, i.e. allow upgrade to f7
  - review existing tools (e.g. fsfs-stats) to handle logical addressing
  - nice to have: bump short-term cache hit rates to > 99.9%

  Maybe backported from FSX in future release:

  - prefetch daemon

  Gains already demonstrated: 3x speedup in c/o, 10x speedup in
  merge-info evaluation, ~100x speedup in log (YMMV)

* FSX is a third backend (alongside FSFS and BDB) that will keep
  its EXPERIMENTAL state for at least 2 releases. That implies
  that there will be no direct upgrade paths between those releases.
  Users may use it in read-only mirrors they already have for
  analyzing large repositories. This is where FSX should excel
  from the start.

  There is a fair chance that FSX will be the first implementation
  of FS2 such that the long-term upgrade path would be from
  [FSFS,BDB] -> [FSX]. The key is to "always" have a fully
  functional implementation instead of starting from the ground up.

  Features include (more will be added when we start designing FS2):

  - logical addressing, block read, pack() reordering
  - replace fixed-window txdelta with variable-window txdelta2
  - replace txdelta windows with star-delta containers
  - replace the reps / noderev / changes items with much lower
    overhead containers. This is the point where backward compat
    begins to hurt very much.
  - staged packing (more and more revs per pack file)
  - enhanced change list info that allows for "loggy" ops to
    be run without touching tons of directory reps
  - fully checksummed
  - prefetch daemon
  - in-memory transactions (may require FS2)

  Some design goals:

  - ~50% reduction in repo size. On par with or better than git.
  - space to represent a merged node: ~100 bytes typ.
  - commit speed >1000 revs/s
  - verification in O(repo size), ~1GB/s
  - log, log -g, merge and c/o at > 100MB/s from cold start
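
As a rough illustration of why logical addressing and pack()
reordering go together, here is a minimal Python sketch. It is not
the FSFS or FSX on-disk format: the PackFile class, its index
layout and the pack() helper are all hypothetical names made up for
this example. The only point is that once items are referenced by
(revision, item number) instead of by file offset, a pack step can
lay them out in any order without invalidating references.

```python
# Illustrative sketch only: names and layout are hypothetical and do
# not reflect the actual FSFS / FSX implementation.

class PackFile:
    """Toy container that stores items behind a logical index."""

    def __init__(self):
        self.data = bytearray()
        # Logical address (revision, item_number) -> (offset, length)
        self.index = {}

    def append(self, revision, item_number, payload):
        # Record where the item physically lands, then store it.
        self.index[(revision, item_number)] = (len(self.data), len(payload))
        self.data += payload

    def read(self, revision, item_number):
        # Callers only ever use the logical address.
        offset, length = self.index[(revision, item_number)]
        return bytes(self.data[offset:offset + length])


def pack(source, order):
    """Copy all items listed in 'order' into a new PackFile, in that order."""
    target = PackFile()
    for revision, item_number in order:
        target.append(revision, item_number, source.read(revision, item_number))
    return target
```

After pack(), the physical offsets differ but read(revision,
item_number) returns the same bytes, which is what would let a real
implementation group related items next to each other on disk.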

I hope that makes sense to most of you.

-- Stefan^2.
Received on 2013-07-02 15:52:44 CEST

This is an archived mail posted to the Subversion Dev mailing list.
