Stefan Fuhrmann wrote on Wed, Jun 25, 2014 at 17:34:43 +0200:
> On Wed, Jun 25, 2014 at 5:09 PM, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
>
> > Subversion 1.8 and before in general uses human readable decimal
> > format to store numbers in FSFS repositories on disk.
>
>
> True. However, there are exceptions to that general rule.
> The index data uses the same basic encoding as we
> already use in txdelta. In both cases, encoding density
> is critical to I/O performance.
>
Is "density" the right word? The density ratio between base-2⁷ encoding
and base-10 encoding is a constant factor; is that constant significant?
Perhaps an ASCII hexadecimal integer would solve whatever problems with
ASCII decimals a txdelta (base-2⁷) integer solves?
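
To put rough numbers on that constant factor, here is a minimal sketch
that counts the bytes needed for one unsigned integer in ASCII decimal,
ASCII hexadecimal, and a base-2⁷ variable-length form (7 payload bits
per byte, most significant group first, high bit set on every byte
except the last). The function names are made up for illustration, and
the ASCII counts ignore any delimiter a text format would also need.

```python
def encode_base128(n: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, most significant first.

    Every byte except the last has its high bit set as a continuation flag.
    """
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    groups = [n & 0x7F]            # final byte: continuation bit clear
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)   # earlier bytes: continuation bit set
        n >>= 7
    return bytes(reversed(groups))


def encoded_sizes(n: int) -> dict:
    """Return the byte count of n in each of the three encodings."""
    return {
        "decimal": len(str(n)),          # ASCII digits, no delimiter counted
        "hex": len(format(n, "x")),      # ASCII hex digits, no delimiter counted
        "base128": len(encode_base128(n)),
    }


if __name__ == "__main__":
    for value in (0, 127, 128, 1_000_000, 2**32 - 1, 2**64 - 1):
        print(value, encoded_sizes(value))
```

For the largest 64-bit value this gives 20 bytes in decimal, 16 in
hexadecimal and 10 in base-2⁷, so the ratio stays within a small
constant in either direction.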

> For instance, if you disable deltification in the ruby repo
> (but keeping compression active), it explodes to 9.7GB,
> a factor of 22.8. From that it should be obvious how
> important space efficient encoding is to Subversion.
>
What does deltification have to do with choosing between ASCII-encoding
and svndiff-encoding of 64-bit integers?
>
> > Log addressing
> > implementation on trunk introduces new encoding for storing numbers in
> > indexes. Quoting log addressing indexes format documentation [1]
> >
>
> I'm not even sure there is documentation for our txdelta
> on-disk representation. So, FSFS indexes are doing a
> better job in that department, ATM.

Why is this relevant to the subject at hand? Good job on writing the
documentation, but lack of documentation wasn't Ivan's concern.

Cheers,
Daniel
Received on 2014-06-25 20:04:21 CEST