Re: FSFS successor ID design draft

From: Daniel Shahaf <danielsh_at_elego.de>
Date: Mon, 5 Sep 2011 14:09:38 +0300

Stefan Sperling wrote on Mon, Sep 05, 2011 at 12:52:10 +0200:
> On Mon, Sep 05, 2011 at 01:23:14PM +0300, Daniel Shahaf wrote:
> > Stefan Sperling wrote on Mon, Sep 05, 2011 at 11:38:11 +0200:
> > > So you're saying that we should run the plaintext proposed above
> > > through svndiff? Can you explain in more detail how this would work?
> > > What is the base of a delta?
> > >
> >
> > The file contains one or more DELTA\n..ENDREP\n streams:
> >
> > DELTA
> > <svndiff stream>
> > ENDREP
> > DELTA
> > <svndiff stream>
> > ENDREP
> >
> > (On second thought, we should be storing the length of the stream
> > somewhere; on the DELTA header seems a fine place:
> >
> > DELTA 512
> > <512 bytes of svndiff stream>
> > ENDREP
> > DELTA 37
> > <37 bytes of svndiff stream>
> > ENDREP
> >
> > .) When the file is read, readers decode all the deltas and concatenate
> > the resulting plaintexts. When the file is rewritten, writers
> > optionally combine the first N deltas into a single delta that produces
> > the combined plaintext.
> >
> > The deltas can be self-compressed (like a DELTA\n rep in the revision
> > files), i.e., having no base.
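
To make the framing quoted above concrete, here is a minimal sketch of a
reader for the "DELTA <length>" variant. It rests on assumptions not settled
in this thread: the payload is followed directly by "ENDREP\n" with no
separating newline (as in the revision files), and the svndiff decoder is
passed in as a hypothetical `decode` callable rather than implemented here.
The function names are made up for illustration.

```python
import io

def read_delta_frames(stream):
    """Yield the raw payload of each 'DELTA <len>\\n<payload>ENDREP\\n' frame.

    Assumption: ENDREP follows the payload immediately, with no extra newline.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of file: no more frames
        parts = header.split()
        if len(parts) != 2 or parts[0] != b"DELTA":
            raise ValueError("bad frame header: %r" % header)
        length = int(parts[1])  # raises ValueError if not a number
        payload = stream.read(length)
        if len(payload) != length:
            raise ValueError("truncated frame")
        if stream.readline() != b"ENDREP\n":
            raise ValueError("missing ENDREP trailer")
        yield payload

def read_plaintext(stream, decode):
    """Decode every frame and concatenate the resulting plaintexts.

    `decode` is a hypothetical callable mapping one svndiff stream (bytes)
    to its plaintext (bytes); no real svndiff decoding is done here.
    """
    return b"".join(decode(p) for p in read_delta_frames(stream))
```

With the length in the header, a reader never has to scan the payload for
the ENDREP marker, which matters because svndiff data is binary and could
contain those bytes.
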
>
> OK, I see. You're trying to save disk space, trading it for CPU time
> during read/write operations. Does that make sense? Is the amount of
> data really going to be big enough to be worth it?
>

When you phrase it that way: I doubt it.

You suggested a 'more efficient storage model', so I remarked about one
idea that crossed my mind. Obviously there are others, such as grouping
the storage by LHS of the mapping, rather than scattering map entries
with the same LHS all over the file.
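
As a rough illustration of the grouping idea, something like the following
would collect all RHS values under their LHS before writing. The entry
representation (pairs of strings) and the one-line-per-LHS output format are
assumptions made up for this sketch, not part of the design draft.

```python
from collections import defaultdict

def group_by_lhs(entries):
    """Collect (lhs, rhs) pairs into {lhs: [rhs, ...]}.

    Both LHS order (first appearance) and RHS order are preserved.
    """
    grouped = defaultdict(list)
    for lhs, rhs in entries:
        grouped[lhs].append(rhs)
    return dict(grouped)

def serialize_grouped(grouped):
    """Write one line per LHS: 'lhs rhs1 rhs2 ...' (hypothetical format)."""
    return "".join(
        "%s %s\n" % (lhs, " ".join(rhss)) for lhs, rhss in grouped.items()
    )
```

A reader looking up one LHS then finds all of its entries in one place
instead of scanning the whole file.
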

> > > What is 'lhs'?
> > lhs = left-hand side
> > rhs = right-hand side
>
> > How about calling them after the RHS'es of the mappings rather than
> > after the fact that they are mappings?
> >
> >
> > Currently:
> >
> > - noderev map file, revision map file, successors data file
> >
> > Perhaps:
> >
> > - noderev posterity file, successor offsets file, successors data file
>
> These names are fine with me.
>
> What would you call them on disk?
>

/successors/progeny/M
/successors/offsets/N
/successors/data/N

where M and N are the oldest revision number of their respective shards.
(So, for example, M = 0, 1000, 2000, 3000... )
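
For illustration, a sketch of how a revision number would map to these
paths. The shard size of 1000 is taken from the example above and is an
assumption; the two kinds of shards need not use the same size, so the
sketch takes them as separate (hypothetical) parameters.

```python
def shard_start(revision, shard_size=1000):
    """Return the oldest revision number of the shard containing `revision`."""
    if revision < 0 or shard_size <= 0:
        raise ValueError("revision must be >= 0 and shard_size > 0")
    return (revision // shard_size) * shard_size

def successor_paths(revision, progeny_shard_size=1000, data_shard_size=1000):
    """Build the three on-disk paths for `revision`.

    M comes from the progeny shard size, N from the offsets/data shard size.
    """
    m = shard_start(revision, progeny_shard_size)
    n = shard_start(revision, data_shard_size)
    return {
        "progeny": "/successors/progeny/%d" % m,
        "offsets": "/successors/offsets/%d" % n,
        "data": "/successors/data/%d" % n,
    }
```

So r2345 would land in /successors/data/2000 under these assumptions.
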

Can probably improve on those names a bit...
Received on 2011-09-05 13:15:38 CEST