On Mon, May 18, 2015 at 1:20 AM, Daniel Shahaf <d.s_at_daniel.shahaf.name>
> Good morning Stefan,
> A couple of questions on the fsfs7 indexes doc:
> --- subversion/libsvn_fs_fs/structure-indexes
> +++ subversion/libsvn_fs_fs/structure-indexes
> @@ -18,6 +18,9 @@ a simple concatenation of runtime structs and as such,
> an implementation
> detail subject to change. A proto index basically aggregates all the
> information that must later be transformed into the final index.
> +### "Subject to change?" The format must be stable & documented since
> +### "begin txn, modify txn, reboot, upgrade svn, commit txn" should work.
That's an outdated section. All index structures,
including the proto-indexes are now well-defined
> General design concerns
> @@ -346,7 +349,10 @@ For performance reasons we use a modified version:
> * combine the big endian representation of these checksums plus the
> remnant of the original stream into a 12 to 15 byte long intermediate
> [i0 .. iK], 12 <= K+1 <= 15
> +### "combine" is unclear. Combine how? Concatenate? Interleave? Add?
Concatenated into a 16 to 19 byte sequence
(so there is also a factual error).
> * FNV checksum = fnv_1a([i0 .. iK]) in big endian representation
> +### Why do we checksum the output of a checksum algorithm? (Bytes i0..iK
> are themselves FNV output)
Not all are FNV output, they also include up to
3 "odd" bytes from the end of the original sequence.
But the main reason is that we want a short / space
efficient checksum to keep the index size small.
A 1-in-a-billion false negative for a random bit corruption
is good enough for a quick low-level check. It is on par
with what e.g. BTRFS does.
> Also, I suggest to change "zero to many" to "zero or more" to avoid
> with the term "one-to-many" of relational databases.
Yeah, that should be changed as well.
Thanks for the review! r1692739 addresses the first
3 points and r1692964 the last one.
Received on 2015-06-01 12:40:03 CEST