Hi all,
Getting back to this oldish thread ...
I have re-analyzed my old test figures, and done some additional
tests. It seems I was mistaken with some of my conclusions (pointing
to I/O instead of CPU), so apologies to Stefan^2 and others for
"wasted cycles" in this discussion. I see now that you are mostly
correct, Stefan, by focusing on optimizing for CPU usage, and I hope
you can save as many cycles as possible ;-).
The only use case where I definitely see a server-I/O-boundedness is
log (see below). All the other actions I tested (update, checkout,
blame) were more sensitive to CPU than to I/O. I thought blame was
also server-I/O-bound, but it is not (it is both client-side and
server-side CPU bound).
I tested by comparing 5 setups (2 machines, different storage):
1) Sun Sparc box (Sol 10), 32 x 1.2 GHz, svn 1.5.4, FSFS back-end on
SAN/NFS (current production setup)
2) Sun Sparc box (Sol 10), 32 x 1.2 GHz, svn 1.5.4, FSFS back-end on
local 10k disk
3) x86 box (Win Vista), 3.0 GHz Core2Duo, svn 1.6.9, FSFS back-end
unpacked on local 10k disk
4) x86 box (Win Vista), 3.0 GHz Core2Duo, svn 1.6.9, FSFS back-end
packed on local 10k disk
5) x86 box (Win Vista), 3.0 GHz Core2Duo, svn 1.6.9, FSFS back-end
packed on local SSD disk
(The machine used for 3), 4) and 5) is not really a proper server (not
suitable for huge loads, lots of concurrent requests, etc.), but I
just used it to compare some things.)
By comparing those 5 setups, I could see which had the most impact for
a particular use case: changing storage for the same machine, or
changing to a different cpu/OS, same storage. A crude way to test, but
still it gave me some indications.
Blame:
- Almost no difference between 1) and 2).
- Huge difference (4 or 5 times faster) by switching to setups 3), 4)
or 5) (not much difference between them).
- Conclusion: CPU-bound
Log:
- Big difference between 1) and 2) (2.4 times faster)
- 3) and 4) perform almost identically to 2).
- Huge difference between 2),3),4) on the one hand, and 5) on the
other (SSD almost 4 times faster than 10k disk).
- Conclusion: I/O-bound
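The reasoning behind both conclusions can be written down as a small
Python sketch. This is only an illustration of the crude comparison
described above, not a real measurement tool: the function name, the
1.5 cut-off and the rounded speedup figures are my own assumptions.

```python
def classify_bottleneck(storage_speedup, machine_speedup, threshold=1.5):
    """Guess the dominant bottleneck from two measured speedups.

    storage_speedup: speedup from faster storage on the same machine
    machine_speedup: speedup from a different CPU/OS on comparable storage
    threshold: assumed cut-off below which a speedup is treated as noise
    """
    storage_matters = storage_speedup >= threshold
    machine_matters = machine_speedup >= threshold
    if storage_matters and machine_matters:
        return "mixed"
    if storage_matters:
        return "I/O-bound"
    if machine_matters:
        return "CPU-bound"
    return "inconclusive"


# Rounded figures from the results above (assumed values for illustration):
# blame: 1) vs 2) almost identical, 2) vs 3) at least 4 times faster
# log:   1) vs 2) 2.4 times faster, 2) vs 3) almost identical
print("blame:", classify_bottleneck(storage_speedup=1.0, machine_speedup=4.0))
print("log:", classify_bottleneck(storage_speedup=2.4, machine_speedup=1.0))
```

Keep in mind that the "machine" comparison also changes the svn version
and the OS, so this only gives an indication.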
About log, some inline replies below ...
On Sun, May 16, 2010 at 12:29 PM, Stefan Fuhrmann
<stefanfuhrmann_at_alice-dsl.de> wrote:
[snip]
> Johan Corveleyn wrote:
[snip]
>> I mainly focused on log and blame (and checkout/update to a lesser
>> degree), so that may be one of the reasons why we're seeing it
>> differently :-) . I suppose the numbers, bottlenecks, ... totally
>> depend on the use case (as well as the hardware/network setup).
>>
>
> The log performance issue has been solved more or less
> in TSVN. In 1.7, we also brought the UI up to speed
> with the internals: even complex full-text searches over
> millions of changes are (almost) interactive.
Yeah, I know TSVN has worked around the slowness of log, and that's
great. But still, I would like log to be fast in svn core as well.
Sometimes a build script or whatever needs to retrieve some log info
with the CLI, or your particular IDE integration doesn't have the
option of asking for only the last 100 log entries, caching stuff
client-side, etc. And besides, maybe it would make TSVN's log even
faster :-).
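To make the build-script case concrete, here is a rough Python sketch of
a script that asks the CLI for a limited number of log entries and times
the call. The helper names are made up for this example; only the
"svn log" options --limit and --verbose are real.

```python
import subprocess
import time


def build_log_command(target, limit=100, verbose=False):
    """Build an 'svn log' command line for the last `limit` entries."""
    cmd = ["svn", "log", "--limit", str(limit)]
    if verbose:
        # --verbose adds the changed-path listing to every entry
        cmd.append("--verbose")
    cmd.append(target)
    return cmd


def time_command(cmd):
    """Run a command, discard its output, return (seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode
```

A script would then call something like
time_command(build_log_command(url, limit=100)), with url pointing at
the file or directory of interest, once for each storage setup being
compared.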
> To speed up log on the server side, you need to maintain
> an index. That's certainly not going to happen before fs-ng.
> Otherwise, you will always end up reading every revision file.
> Only exception: log on the repo root with no changed path
> listing.
Yes, the best solution would indeed be more/better indexing in the
back-end storage, and I understand that will not happen anytime soon.
However, even with the current proverbial "full table scan",
improvements can be made I think. Right now, I have the impression
that svn "scans the table" about 5 times (see old threads [1] and [2],
and recent discussion [3]). A single table scan should suffice :-).
But that's probably for another thread...
[1] http://svn.haxx.se/dev/archive-2009-06/0459.shtml
[2] http://svn.haxx.se/dev/archive-2007-08/0239.shtml
[3] http://svn.haxx.se/dev/archive-2010-05/0153.shtml (ignore the
performance figures in that thread for all actions except for log;
this was basically a comparison between setups 1) and 5) from the
above list, which is not so smart considering they differ both in
storage and in CPU/architecture/OS/...)
Cheers,
--
Johan
Received on 2010-06-02 16:31:29 CEST