Stefan Fuhrmann <stefanfuhrmann_at_alice-dsl.de> writes:
> On 27.09.2010 21:44, Ramkumar Ramachandra wrote:
>> Could you tell me which tools you use to profile the various
>> applications in trunk? I'm looking to profile svnrdump to fix some
>> perf issues, but OProfile doesn't seem to work for me.
>
> Under Linux, I'm using Valgrind / Callgrind and visualize in KCachegrind.
> That gives me a good idea of what code gets executed too often, how
> often a jump (loop or condition) has been taken etc. It will not show the
> non-user and non-CPU runtime, e.g. wait for disk I/O. And it will slow the
> execution by a factor of 100 (YMMV).

The performance of svnrdump is likely to be dominated by I/O from the
repository, network or disk, depending on the RA layer. strace is a
useful tool to see opens/reads/writes: you can see the order in which
the calls occur, how many there are, how big they are and how long
they take.
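
As a rough sketch of that kind of strace run (the target command and
the log file names are placeholders of mine, not anything from the
Subversion tree), something along these lines records the order, size
and duration of the file calls, then a per-syscall summary:

```shell
# Assumption: TARGET stands in for the real command line, for example
# the svnrdump invocation with your own arguments. It is left unquoted
# below on purpose so that the shell splits it into words.
target=${TARGET:-"ls /"}
strace_status=skipped

if command -v strace >/dev/null 2>&1; then
  # -f        follow child processes
  # -tt       timestamp each call, which shows the order of the calls
  # -T        show the time spent inside each call
  # -e trace  %file selects calls that take a path (open, stat, ...)
  # -o        write the trace to a file instead of stderr
  if strace -f -tt -T -e trace=%file,read,write \
            -o strace-detail.log $target >/dev/null 2>&1; then
    strace_status=ok
    # The size of each read/write is the return value at the end of
    # the line; here we only count how many reads there were.
    grep -c 'read(' strace-detail.log || true
  fi

  # -c prints one summary line per syscall: call count, errors, time.
  strace -f -c -o strace-summary.log $target >/dev/null 2>&1 &&
    cat strace-summary.log
fi
```

The detailed log answers "what order and how big", the -c summary
answers "how many and how long in total".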

Valgrind/Callgrind is good and doesn't require you to instrument the
code, but it does help to build with debug information. It does
impose a massive runtime overhead.
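
A minimal Callgrind run might look like the following. The target
command and the output file name are placeholders I made up; only the
valgrind and callgrind_annotate options are real:

```shell
# Assumption: TARGET stands in for the program you want to profile.
target=${TARGET:-"ls /"}
callgrind_status=skipped

if command -v valgrind >/dev/null 2>&1; then
  # No instrumentation is needed, but a build with -g gives readable
  # function names and line numbers in the report.
  if valgrind --tool=callgrind --callgrind-out-file=callgrind.out.demo \
              $target >/dev/null 2>&1; then
    callgrind_status=ok
    # Text report of the most expensive functions. KCachegrind reads
    # the same file: kcachegrind callgrind.out.demo
    callgrind_annotate callgrind.out.demo | head -n 30
  fi
fi
```

If the program in the build tree is a libtool wrapper script (the
lt-svnrdump image name in the opreport command suggests it is), then
"libtool --mode=execute valgrind ..." runs Valgrind on the real binary
rather than on the shell that executes the wrapper.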

OProfile will give you CPU usage with far lower overhead than
Valgrind/Callgrind. As with Valgrind/Callgrind, you don't need to
instrument the code, but it works better with debug information. With
modern gcc, if you build with -O2 you also need
-fno-omit-frame-pointer for the call graph to work. I use it like so:

  opcontrol --init
  opcontrol --no-vmlinux --separate=library --callgraph=32
  opcontrol --start
  opcontrol --reset
  subversion/svnrdump/svnrdump ...
  opcontrol --stop
  opcontrol --dump
  opreport --merge all -l image:/path/to/lt-svnrdump

This is what I get when dumping 1000 revisions from a local mirror of
the Subversion repository over ra_neon:

  CPU: Core 2, speed 1200 MHz (estimated)
  Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (Unhalted core cycles) count 100000
  samples  %        app name                 symbol name
  4738     41.1893  no-vmlinux               (no symbols)
  1037     9.0150   libxml2.so.2.6.32        (no symbols)
  700      6.0854   libneon.so.27.1.2        (no symbols)
  238      2.0690   libc-2.7.so              _int_malloc
  228      1.9821   libc-2.7.so              memcpy
  221      1.9212   libc-2.7.so              memset
  217      1.8865   libc-2.7.so              strlen
  191      1.6604   libsvn_subr-1.so.0.0.0   decode_bytes
  180      1.5648   libc-2.7.so              vfprintf
  171      1.4866   libc-2.7.so              strcmp
  153      1.3301   libapr-1.so.0.2.12       apr_hashfunc_default
  134      1.1649   libapr-1.so.0.2.12       apr_vformatter
  130      1.1301   libapr-1.so.0.2.12       apr_palloc
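
To see where the time goes per library rather than per symbol, the
rows of that listing can be totalled by app name. This is only a
sketch over the rows quoted here, not over the full report:

```shell
# Rows copied from the opreport listing: samples, %, app name, symbol.
# awk sums the first two columns per app name (third column), and
# sort -rn puts the library with the most samples first.
totals=$(awk '{ samples[$3] += $1; pct[$3] += $2 }
  END { for (app in samples)
          printf "%5d %8.4f %s\n", samples[app], pct[app], app }' <<'EOF' | sort -rn
4738 41.1893 no-vmlinux (no symbols)
1037 9.0150 libxml2.so.2.6.32 (no symbols)
700 6.0854 libneon.so.27.1.2 (no symbols)
238 2.0690 libc-2.7.so _int_malloc
228 1.9821 libc-2.7.so memcpy
221 1.9212 libc-2.7.so memset
217 1.8865 libc-2.7.so strlen
191 1.6604 libsvn_subr-1.so.0.0.0 decode_bytes
180 1.5648 libc-2.7.so vfprintf
171 1.4866 libc-2.7.so strcmp
153 1.3301 libapr-1.so.0.2.12 apr_hashfunc_default
134 1.1649 libapr-1.so.0.2.12 apr_vformatter
130 1.1301 libapr-1.so.0.2.12 apr_palloc
EOF
)
echo "$totals"
```

On these rows the kernel (no-vmlinux) stays on top with 4738 samples,
libc comes second with 1255 and libapr adds up to 417. With
--no-vmlinux all kernel samples land in that single no-vmlinux entry,
so it cannot be broken down further from this report.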

That's on my Debian desktop. At the recent Apache Retreat I tried to
demonstrate OProfile on my Ubuntu laptop and could not get it to work
properly, probably because I forgot about -fno-omit-frame-pointer.

Finally there is traditional gprof. It's a long time since I used it,
so I don't remember the details. You instrument the code at compile
time using CFLAGS=-pg. If an instrumented function foo calls into a
library bar that is not instrumented, then bar is invisible; all you
see is how long foo took to execute.
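
For what it's worth, here is a toy gprof run to show the workflow. The
demo program and all file names are invented for illustration; the
only real pieces are "gcc -pg" (needed when linking as well as when
compiling) and "gprof <program> gmon.out":

```shell
gprof_status=skipped
if command -v gcc >/dev/null 2>&1 && command -v gprof >/dev/null 2>&1; then
  dir=$(mktemp -d)
  cat > "$dir/demo.c" <<'EOF'
#include <stdio.h>
#include <string.h>

/* foo is built with -pg; memset lives in the uninstrumented libc. */
static unsigned long foo(char *buf, size_t n)
{
  unsigned long sum = 0;
  int i;
  for (i = 0; i < 2000; i++) {
    memset(buf, i & 0xff, n);
    sum += (unsigned char)buf[n - 1];
  }
  return sum;
}

int main(void)
{
  static char buf[1 << 20];
  printf("%lu\n", foo(buf, sizeof buf));
  return 0;
}
EOF
  # Build with -pg, run once to produce gmon.out in the current
  # directory, then let gprof turn it into a text report.
  if (cd "$dir" && gcc -pg -O0 -o demo demo.c && ./demo >/dev/null &&
      gprof ./demo gmon.out > profile.txt); then
    gprof_status=ok
    head -n 15 "$dir/profile.txt"
  fi
fi
```

The report should list foo and main, while memset, which comes from
the uninstrumented libc, does not get an entry of its own.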

--
Philip
Received on 2010-09-28 19:29:16 CEST