
Re: Numbers encoding in FSFS log addressing indexes

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Wed, 25 Jun 2014 18:03:24 +0000

Stefan Fuhrmann wrote on Wed, Jun 25, 2014 at 17:34:43 +0200:
> On Wed, Jun 25, 2014 at 5:09 PM, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
>
> > Subversion 1.8 and before in general uses human readable decimal
> > format to store numbers in FSFS repositories on disk.
>
>
> True. However, there are exceptions to that general rule.
> The index data uses the same basic encoding as we
> already use in txdelta. In both cases, encoding density
> is critical to I/O performance.
>

Is "density" the right word? The density ratio between base-2⁷ encoding
and base-10 encoding is a constant factor; is that constant significant?

Perhaps an ASCII hexadecimal integer would solve whatever problems with
ASCII decimals a txdelta (base-2⁷) integer solves?
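
To put rough numbers on that constant, here is a minimal sketch that
compares the encoded length of one integer in the three candidate
formats. It is not Subversion's code: the names encode_base128 and
encoded_sizes are made up for this illustration, the byte order
(most significant 7-bit group first, high bit set on every byte but
the last) is an assumption that does not affect the length, and any
field separator an ASCII format would need is not counted.

```python
def encode_base128(n: int) -> bytes:
    """Encode a non-negative integer using 7 payload bits per byte.

    Illustrative only: most significant group first, with the high
    bit set on every byte except the last one.
    """
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    groups = [n & 0x7F]          # last byte: continuation bit clear
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)  # earlier bytes: bit set
        n >>= 7
    return bytes(reversed(groups))


def encoded_sizes(n: int) -> dict:
    """Return the encoded length in bytes for each candidate format."""
    return {
        "decimal": len(str(n)),           # ASCII base 10
        "hex": len(format(n, "x")),       # ASCII base 16
        "base128": len(encode_base128(n)),
    }


if __name__ == "__main__":
    for value in (0, 127, 128, 300, 2**32 - 1, 2**64 - 1):
        print(value, encoded_sizes(value))
```

For the largest unsigned 64-bit value this gives 20 bytes in decimal,
16 in hexadecimal and 10 in base-2⁷. For large values the ratios tend
towards roughly 2.1 (decimal) and 1.75 (hexadecimal), since a base-2⁷
byte carries 7 bits while a decimal digit carries about 3.32 and a
hexadecimal digit carries 4.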

> For instance, if you disable deltification in the ruby repo
> (but keep compression active), it explodes to 9.7GB,
> a factor of 22.8. From that it should be obvious how
> important space-efficient encoding is to Subversion.
>

What does deltification have to do with choosing between ASCII-encoding
and svndiff-encoding of 64-bit integers?

>
> > Log addressing
> > implementation on trunk introduces new encoding for storing numbers in
> > indexes. Quoting log addressing indexes format documentation [1]
> >
>
> I'm not even sure there is documentation for our txdelta
> on-disk representation. So, FSFS indexes are doing a
> better job in that department, ATM.

Why is this relevant to the subject at hand? Good job on writing the
documentation, but lack of documentation wasn't Ivan's concern.

Cheers,

Daniel
Received on 2014-06-25 20:04:21 CEST

This is an archived mail posted to the Subversion Dev mailing list.
