I'd like to implement 'svn-bench null-blame', to measure the
server-side (and network) cost involved in blame (mainly get_file_revs2).
I started by copying blame-cmd.c to null-blame-cmd.c, but then I
realized that most client-side work during 'blame' is actually done in
the client layer, i.e. blame.c (reconstructing the full-texts from the
received deltas, and diffing each pair of consecutive revisions).
So how should I go about making the client side do "null work" (or
close to it)? Should I copy the needed parts of blame.c into
null-blame-cmd.c and "nullify" them there?
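To make the "null work" idea concrete, here is a minimal sketch, not
Subversion code: keep the per-revision callback that the blame driver
would register, but have it only count what arrives instead of
rebuilding full-texts and diffing. Everything in it (FileRev,
NullBlameStats, null_file_rev_handler, drive) is a hypothetical
stand-in, written in Python for brevity; the real thing would be C
code receiving the get_file_revs2 callbacks.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class FileRev:
    """Hypothetical stand-in for one file revision sent by the server."""
    revision: int
    delta_windows: List[bytes]  # raw delta chunks; never applied here


@dataclass
class NullBlameStats:
    """What a null-blame run would report: traffic only, no blame data."""
    revisions: int = 0
    delta_bytes: int = 0


def null_file_rev_handler(stats: NullBlameStats, file_rev: FileRev) -> None:
    """Consume one file revision without building full-texts or diffing."""
    stats.revisions += 1
    for window in file_rev.delta_windows:
        stats.delta_bytes += len(window)  # count the bytes, then drop them


def drive(file_revs: Iterable[FileRev],
          handler: Callable[[NullBlameStats, FileRev], None]) -> NullBlameStats:
    """Hypothetical driver: feed every received revision to the handler."""
    stats = NullBlameStats()
    for file_rev in file_revs:
        handler(stats, file_rev)
    return stats


# Fake stream standing in for what the server would send.
fake_stream = [FileRev(1, [b"abc"]), FileRev(2, [b"de", b"f"]), FileRev(3, [])]
result = drive(fake_stream, null_file_rev_handler)
print(result.revisions, result.delta_bytes)  # prints: 3 6
```

With a handler like this, whatever time remains in a run would be the
server and the network, which is the part null-blame is meant to
measure.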
The motivation for doing this is the following: last week, during the
hackathon, danielsh implemented reverse blame (aka kidney blame). I
used this on my "huge xml file with lots of revisions" to see if it
would be faster than the normal blame (I expected it to be faster,
because the server would serve the deltas immediately while walking
history backwards, making the client start blame calculation (diffs)
almost immediately).
To my surprise, it was slightly slower. I suspect this is because the
backend (FSFS) stores forward deltas (sometimes with skip-deltas),
which can often be used directly without much overhead, and sent
directly to the client. But when walking get_file_revs backwards,
every delta has to be computed by the server (no shortcuts). In other
words, serving deltas from youngest to oldest is more expensive for
the server.
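The suspicion above can be illustrated with a toy model. This is only
a sketch under strong assumptions: it is not how FSFS actually stores
data, skip-deltas are left out, and the prefix-based delta format and
all names (ToyStore, compute_delta, apply_delta) are made up for the
example. The store keeps one base text plus forward deltas. Serving
oldest-first hands out stored deltas unchanged; serving youngest-first
has to rebuild two full-texts and compute a fresh delta for every step.

```python
import os.path
from typing import List, Tuple

# Toy delta: keep the first `keep` chars of the source, then append `suffix`.
Delta = Tuple[int, str]


def compute_delta(source: str, target: str) -> Delta:
    """Build a delta that turns `source` into `target`."""
    keep = len(os.path.commonprefix([source, target]))
    return keep, target[keep:]


def apply_delta(source: str, delta: Delta) -> str:
    """Apply a toy delta to `source`."""
    keep, suffix = delta
    return source[:keep] + suffix


class ToyStore:
    """Hypothetical backend: one base text plus forward deltas."""

    def __init__(self, fulltexts: List[str]) -> None:
        self.base = fulltexts[0]
        # forward[i] turns revision i into revision i + 1 (made at "commit").
        self.forward = [compute_delta(a, b)
                        for a, b in zip(fulltexts, fulltexts[1:])]
        self.computed = 0  # deltas computed while serving a request

    def fulltext(self, rev: int) -> str:
        """Rebuild revision `rev` by applying forward deltas to the base."""
        text = self.base
        for delta in self.forward[:rev]:
            text = apply_delta(text, delta)
        return text

    def serve_oldest_first(self) -> List[Delta]:
        """Stored deltas already point the right way: send them as-is."""
        return list(self.forward)

    def serve_youngest_first(self) -> List[Delta]:
        """Each backward delta needs two full-texts and a new computation."""
        deltas = []
        for rev in range(len(self.forward), 0, -1):
            newer, older = self.fulltext(rev), self.fulltext(rev - 1)
            deltas.append(compute_delta(newer, older))
            self.computed += 1
        return deltas
```

In this model the oldest-first walk computes nothing at serve time,
while the youngest-first walk computes one delta per revision step. How
close the real server comes to either extreme is exactly what a
null-blame measurement would show.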
It'd be nice to measure this effect a bit more objectively, hence my
desire for null-blame.
(In the end, I think backwards walking could still be a good
optimization for blame, because in normal setups the client (diffing)
will still be the bottleneck. But when I'm running both client and
server on my 6-year-old laptop, it is to be expected that the server
also causes some slowdown.)
--
Johan
Received on 2013-06-18 00:32:00 CEST