
Re: Numbers encoding in FSFS log addressing indexes

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Wed, 25 Jun 2014 18:03:24 +0000

Stefan Fuhrmann wrote on Wed, Jun 25, 2014 at 17:34:43 +0200:
> On Wed, Jun 25, 2014 at 5:09 PM, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
> > Subversion 1.8 and before in general uses human readable decimal
> > format to store numbers in FSFS repositories on disk.
> True. However, there are exceptions to that general rule.
> The index data uses the same basic encoding as we
> already use in txdelta. In both cases, encoding density
> is critical to I/O performance.

Is "density" the right word? The density ratio between base-2⁷ encoding
and base-10 encoding is a constant factor; is that constant significant?

Perhaps an ASCII hexadecimal integer would solve whatever problems with
ASCII decimals a txdelta (base-2⁷) integer solves?
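
To put rough numbers on that constant, here is a minimal Python sketch that
counts the bytes needed for one integer as ASCII decimal, ASCII hexadecimal,
and a base-2⁷ variable-length encoding. The function names encode_base128 and
encoded_sizes are hypothetical, and the byte layout (most significant 7-bit
group first, high bit set on every byte except the last) is an assumption for
illustration, not a statement of the exact FSFS index format. The byte order
does not affect the length, which is all that is compared here.

```python
def encode_base128(n: int) -> bytes:
    """Encode a non-negative integer in 7-bit groups, most significant first.

    Assumed layout: every byte except the last has its high bit set
    as a continuation flag.
    """
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    groups = [n & 0x7F]            # final byte: continuation bit clear
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)  # earlier bytes: continuation bit set
        n >>= 7
    return bytes(reversed(groups))


def encoded_sizes(n: int) -> dict:
    """Return the number of bytes each representation needs for n."""
    return {
        "decimal": len(str(n)),         # ASCII digits, no separator counted
        "hex": len(format(n, "x")),     # ASCII hex digits, no separator counted
        "base128": len(encode_base128(n)),
    }


if __name__ == "__main__":
    for value in (0, 127, 128, 65535, 10**9, 2**64 - 1):
        print(value, encoded_sizes(value))
```

For the largest 64-bit value this gives 20 bytes as decimal, 16 as hex and 10
in base-2⁷. The ASCII counts exclude any separator a text format would need
between numbers, whereas the base-2⁷ form marks its own end.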

> For instance, if you disable deltification in the ruby repo
> (but keeping compression active), it explodes to 9.7GB,
> a factor of 22.8. From that it should be obvious how
> important space efficient encoding is to Subversion.

What does deltification have to do with choosing between ASCII-encoding
and svndiff-encoding of 64-bit integers?

> > Log addressing
> > implementation on trunk introduces new encoding for storing numbers in
> > indexes. Quoting log addressing indexes format documentation [1]
> >
> I'm not even sure there is documentation for our txdelta
> on-disk representation. So, FSFS indexes are doing a
> better job in that department, ATM.

Why is this relevant to the subject at hand? Good job on writing the
documentation, but lack of documentation wasn't Ivan's concern.


Received on 2014-06-25 20:04:21 CEST

This is an archived mail posted to the Subversion Dev mailing list.
