Re: [PATCH in progress] Pristine text store - pristine_install

From: Neels Hofmeyr <neels_at_elego.de>
Date: Wed, 23 Feb 2011 15:37:55 +0100

On Sat, 2011-02-19 at 04:53 +0100, Stefan Sperling wrote:
> On Fri, Feb 18, 2011 at 09:19:56PM -0500, Greg Stein wrote:
> > Can somebody provide a pointer to some of the latest speed analysis?
>
> Neels is on vacation this week. When he returns, I'll prod him
> about running his performance tests again and sharing the results.

* neels prodded

If my tests are going to be "official", I feel they need some
verification and outside opinions, and possibly an extension so that
they test more than ra_local.

- I run a pseudo-randomized checkout-switch-modify-merge-resolve series
in ra_local only. This emphasizes the timings of libsvn_wc, so any
additional working copy overhead shows up as a bad time factor. Example:
the test may spit out a time factor of 2 (twice as slow) even though
network communication is commonly orders of magnitude slower, so that
'real' ra_* access would never notice such a bad factor.

- On the other hand, if trunk for some reason needed more ra_
connections than 1.6.x, we wouldn't see that, since ra_local access time
is negligible.

(Maybe it would be better to talk about added seconds of run time
instead of factors.)
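
To illustrate the difference between factors and added seconds, here is
a tiny sketch. The numbers are made up for illustration, they are not
measurements, and compare() is a hypothetical helper that is not part of
benchmark.py:

```python
def compare(baseline_seconds, candidate_seconds):
    """Return (factor, added_seconds) of a candidate timing vs. a baseline."""
    factor = candidate_seconds / baseline_seconds
    added = candidate_seconds - baseline_seconds
    return factor, added


# Hypothetical: working copy work takes 1s in the old build, 2s in the new.
local_only = compare(1.0, 2.0)

# Same working copy cost, plus an assumed 30s of network time in both builds.
with_network = compare(1.0 + 30.0, 2.0 + 30.0)

print(local_only)    # factor 2.0, one second added
print(with_network)  # factor roughly 1.03, still one second added
```

The added second stays the same in both cases, while the factor depends
entirely on how much unrelated time surrounds it.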

Anyone else keen on forming an opinion on my humble tests? Let's break
it down.

I've got one py script that is able to run N tests for a single svn
build in a specific dir depth / dir spread config, and it writes its
results into a python pickle file.

The results add up the times that each subcommand takes to complete, by
name. E.g. all 'svn update' runs are added up.

Later runs can combine and compare pickle files and print stats.
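
A minimal sketch of that bookkeeping, to make the idea concrete. This is
not the code from benchmark.py; the class and function names (Timings,
load, combine, factors) are made up for illustration, and the pickle
file is assumed to hold a plain dict of subcommand name to seconds:

```python
import pickle
import time
from collections import defaultdict


class Timings:
    """Adds up elapsed seconds per subcommand name."""

    def __init__(self):
        self.totals = defaultdict(float)

    def add(self, name, seconds):
        self.totals[name] += seconds

    def timed(self, name, func, *args):
        # Run func(*args) and book the elapsed time under `name`.
        start = time.perf_counter()
        try:
            return func(*args)
        finally:
            self.add(name, time.perf_counter() - start)

    def save(self, path):
        with open(path, 'wb') as f:
            pickle.dump(dict(self.totals), f)


def load(path):
    with open(path, 'rb') as f:
        return pickle.load(f)


def combine(*results):
    """Sum several {name: seconds} dicts into one."""
    total = defaultdict(float)
    for result in results:
        for name, seconds in result.items():
            total[name] += seconds
    return dict(total)


def factors(baseline, candidate):
    """Per-subcommand time factor, candidate relative to baseline."""
    return {name: candidate[name] / baseline[name]
            for name in baseline
            if name in candidate and baseline[name] > 0}
```

With something like this, comparing two builds boils down to
factors(load(old_file), load(new_file)), where old_file and new_file are
whatever pickle files the runs wrote.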

A bash script calls a series of such svn-version/dir-depth/dir-spread
runs and finally compares the pickle files to print overall stats.

Roughly, these are the svn commands that get run. The names are python
functions that call svn in the way their names suggest:
[[[
run_cmd(['svnadmin', 'create', repos])
svn('checkout', file_url, wc)

      trunk = j(wc, 'trunk')
      create_tree(trunk, levels, spread)
      add(trunk)
      st(wc)
      ci(wc)
      up(wc)
      propadd_tree(trunk, 0.5)
      ci(wc)
      up(wc)
      st(wc)

trunk_url = file_url + '/trunk'
branch_url = file_url + '/branch'

svn('copy', '-mm', trunk_url, branch_url)
st(wc)

up(wc)
st(wc)

      svn('checkout', trunk_url, wc2)
      st(wc2)
      modify_tree(wc2, 0.5)
      st(wc2)
      ci(wc2)
      up(wc2)
      up(wc)

      svn('switch', branch_url, wc2)
      modify_tree(wc2, 0.5)
      st(wc2)
      ci(wc2)
      up(wc2)
      up(wc)

      modify_tree(trunk, 0.5)
      st(wc)
      ci(wc)
      up(wc2)
      up(wc)

      svn('merge', '--accept=postpone', trunk_url, wc2)
      st(wc2)
      svn('resolve', '--accept=mine-conflict', wc2)
      st(wc2)
      svn('resolved', '-R', wc2)
      st(wc2)
      ci(wc2)
      up(wc2)
      up(wc)

      svn('merge', '--accept=postpone', '--reintegrate', branch_url,
          trunk)
      st(wc)
      svn('resolve', '--accept=mine-conflict', wc)
      st(wc)
      svn('resolved', '-R', wc)
      st(wc)
      ci(wc)
      up(wc2)
      up(wc)

      svn('delete', j(wc, 'branch'))
      ci(wc)
      up(wc2)
      up(wc)
]]]
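
For readers without the attachment, the short helpers could look
something like the sketch below. This is a guess at their shape, not the
actual benchmark.py code: the `run` parameter is an assumption added
here so the wrappers can be exercised without an svn binary, and the
'-mm' log message for commit is borrowed from the copy command above.

```python
import subprocess


def svn(*args, run=subprocess.check_call):
    """Call the svn binary found on PATH with the given arguments."""
    return run(['svn'] + list(args))


def add(path, run=subprocess.check_call):
    return svn('add', path, run=run)


def st(path, run=subprocess.check_call):
    return svn('status', path, run=run)


def ci(path, run=subprocess.check_call):
    # '-mm' is '-m' with the one-letter log message 'm'.
    return svn('commit', '-mm', path, run=run)


def up(path, run=subprocess.check_call):
    return svn('update', path, run=run)
```

Passing a different `run` (for example a list's append method) records
the command lines instead of executing them, which is handy for a dry
run.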

Excerpts from the "outer layer" shell script:
[[[

batch(){
  levels="$1"
  spread="$2"
  N="$3"
  pre="${levels}x${spread}_"
  eval "$(pat bashrc)"
  pat use 1.6
  ./benchmark.py run "${pre}1.6_1.runs" "$levels" "$spread" "$N"
  ./benchmark.py run "${pre}1.6_2.runs" "$levels" "$spread" "$N"
  pat use 1.7
  ./benchmark.py run "${pre}1.7_1.runs" "$levels" "$spread" "$N"
  ./benchmark.py run "${pre}1.7_2.runs" "$levels" "$spread" "$N"

  <combine stats>
  <print stats>
}
]]]

This is a bash function that switches to svn 1.6 (using my humble helper
'pat' [1] to modify the PATH environment), runs the whole test N*2
times, then switches to svn 1.7 and again runs the thing 2N times. It
runs each build twice so that it can also compare two identical runs,
for us to verify whether those timing factors are sufficiently near 1.0.

Then that whole thing is run in three configurations (a: 4x4, b: 100x1,
c: 1x100). The first number is how deep the deepest dir tree is
("levels"), the second is how many child dirs each dir has ("spread").
Each configuration is run N times.

We can very easily modify these few numbers to choose test run size from
tiny to "infinite".

[[[
N=3
# run a: levels 4, spread 4 (4x4)
al=4
as=4

# run b: levels 100, spread 1 (100x1)
bl=100
bs=1

# run c...
cl=1
cs=100

batch $al $as $N
batch $bl $bs $N
batch $cl $cs $N

<combine stats>
<print overall stats>
]]]

I'd be delighted if anyone else wants to hack this stuff -- with or w/o
me.

~Neels

[1] I wrote pat for myself to take care of repetitive svn devel tasks. I
also use it to maintain several different svn builds alongside each
other, so it's rather large and unreviewed. In this test, pat is only
used to modify the PATH variable towards the 1.6 or the 1.7 build,
respectively. http://hofmeyr.de/code/pat/

text/x-python attachment: benchmark.py

application/x-shellscript attachment: run

Received on 2011-02-23 15:39:06 CET
