On Thu, Jun 19, 2014 at 11:38 PM, Branko Čibej <brane_at_wandisco.com> wrote:
>  On 19.06.2014 17:06, Stefan Fuhrmann wrote:
>
> Turns out that the ruby repo is something special
> in that it has very deep histories of relatively few,
> very small files combined with one huge changelog
> file (the latter taking up ~75% of the repo). See
> below for details.
>
> Also, please note that your exports contained
> more than 500000 files. Using 16MB of cache with
> that project size *may* not be an adequate setup.
> Upping that to an insane 256MB (roughly what 1.6
> would use anyway) gives much better numbers.
> However, there is hardly a difference between
> f6 and f7 in these runs.
>
> Heh, this sounds suspiciously like saying that one has to have the right
> test data to make v7 faster than v6. :)
>
Sure. As far as I can see, we always end up reading
most of the repository in this specific case, as there
are virtually no entirely cold data blocks. There are only
~40 pack files, and most files seem to have been present
since early in the project -> most paths need to be read
from almost all pack files.

Since the whole repo is also quite small on disk (~400MB),
everything ends up in the disk cache and the reading order
is of little importance. The BSD repo, OTOH, has 20x
as much data in its "trunk" ("/head") and the history
is 5x deeper. Add lots of branching and merging, and
f7's reorg-on-pack helps quite a bit.
-- Stefan^2.
Received on 2014-06-20 01:55:48 CEST