Re: Backward or forward deltas, backend, FSX

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Sat, 17 Feb 2018 13:26:57 +0000

Péter wrote on Fri, 16 Feb 2018 23:52 +0100:
> An additional minor question: when making "big" (skip-) deltas from
> smaller deltas: does not this mean that for many
> changes (rev[N] -> rev[N+1]) this change will be included (redundantly)
> in many deltas?

Yes.

> So, we gain some speed - but at the cost of size.

We also gain recoverability. If r3 of a file is corrupted,
then the contents could probably be recovered by diffing r2 to r4 and
consulting the log messages and the developers who maintain the
subject file.

> It is against the "store every piece [fragment?] only once" principle.

Also known as https://en.wikipedia.org/wiki/Don%27t_repeat_yourself

> (And it is not fully clear for me why would be [lot?] quicker playing
> the same A + B changes as 1 delta, than playing as
> 2 deltas? Aside from the [rare] cases when change B does some "opposite"
> of A.)

In short, I suppose it's because deltas are typically much smaller than
the files they modify.

Suppose you have a 1MB file and two successive commits each change one
line therein. The deltas would be about 100 bytes each. Combining the
deltas would operate on 200 bytes of input. Applying a delta to that
file would operate on 1.000100MB of input. If you apply two deltas to
that file, you'll have operated on 2.000200MB in total (ignoring the
fact that the update isn't done in-place). If you combine the deltas
before applying them, you'll have operated on 1.000400MB of input (200B
in the delta combiner and 1.000200MB in the delta applier).
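
The arithmetic above can be sketched in a few lines of Python. This is
only a back-of-the-envelope model, not Subversion code: the function
names are made up for illustration, 1MB is taken as 1,000,000 bytes, the
cost of an operation is simply the total size of its inputs, and the
combined delta is assumed to be as large as the sum of its parts.

```python
FILE_SIZE = 1_000_000   # 1MB file (assuming 1MB = 1,000,000 bytes)
DELTA_SIZE = 100        # approximate size of each one-line delta


def apply_cost(file_size, delta_size):
    """Bytes of input read when applying one delta to a fulltext."""
    return file_size + delta_size


def combine_cost(delta_sizes):
    """Bytes of input read when combining several deltas into one."""
    return sum(delta_sizes)


def cost_sequential(file_size, delta_size, n):
    """Apply n deltas one after another (file size assumed unchanged)."""
    return n * apply_cost(file_size, delta_size)


def cost_combined(file_size, delta_size, n):
    """Combine n deltas first, then apply the combined delta once."""
    combined_size = n * delta_size  # assumption: no shrinkage on combining
    return combine_cost([delta_size] * n) + apply_cost(file_size, combined_size)


print(cost_sequential(FILE_SIZE, DELTA_SIZE, 2))  # 2000200
print(cost_combined(FILE_SIZE, DELTA_SIZE, 2))    # 1000400
```

Under the same assumptions, 16 deltas come to 16001600 bytes when
applied one by one versus 1003200 bytes when combined first.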

> Daniel Shahaf:
> > FSFS and FSX are designed around the assumption that the storage backing
> > older revisions is immutable.
>
> Older revisions (older than the very last revision) can be kept read-
> only in both cases, I think. (What am I missing?)
>

In both FSFS and FSX, *all* revisions that have already been committed
can be (and, in recent releases, are) read-only.

> > max-linear-deltification (see fsfs.conf) is 16, meaning that no fulltext
> > will require 17 delta applications to produce.
>
> (Okay, the number of applicated deltas was reduced, but not the amount
> of changes. The whole (or half?) of the complete
> "life" of some file may be re-played, just to yield the current state.
> Analogue: for an animal (for example, frog):
> "let's start with this single cell; then apply changes A,
> then.......................... here is the frog".)

Again, this is a time/space trade-off. The user controls the trade-off
by setting the value of max-linear-deltification. If you often use fixed-
size files that get constantly rewritten, lowering the value might
result in better performance for your workflow. On the other hand, if
you store frogs in Subversion, the 16 deltas will consist mostly of
"add" svndiff0 instructions, and the overall performance will be
comparable to that of slurping a file that had been fragmented (at the
filesystem level) into 16 different contiguous blocks on disk, which
happen to be sorted optimally for the read/write head's platter
scanning order (as though the file was created by 16 append operations).

Cheers,

Daniel
Received on 2018-02-17 14:27:11 CET

This is an archived mail posted to the Subversion Dev mailing list.
