Daniel Rall <dlr@collab.net> wrote on 10/25/2006 02:55:09 PM:
> I've created a temporary branch, "ood-status-info", so we have a place
> to work on refining these fixes.
>
> On Fri, 13 Oct 2006, Paul Burba wrote:
<snip>
> > Our remaining issues(?):
> >
> > A) The performance of svn_repos_deleted_rev()
>
> This is currently too slow to integrate into trunk. Let's write a
> faster implementation -- hopefully we can speed things up by moving
> the logic down inside the FS layer.
I think I solved this performance problem, not in FS, but with a different
implmentation of the new function svn_repos_deleted_rev().
A quick recap the existing performance problem (in r22118 of
svn/branches/ood-status-info):
* Start with a standard greek-tree test (fsfs) repos.
* Delete /A/D/H/omega in r2.
* Make a whole slew of (50,000+) of text mods to the other files.
* Check out r1 via file://
* Run "time svn st -u -v" in Cygwin
r22118 of svn_repos_deleted_rev() implementation takes almost 5 minutes on
my machine:
$ time svn st -v -u SVN/stat-tests-1/
* 1 1 jrandom SVN\stat-tests-1\A\B\E\beta
* 1 1 jrandom SVN\stat-tests-1\A\B\E\alpha
1 1 jrandom SVN\stat-tests-1\A\B\E
* 1 1 jrandom SVN\stat-tests-1\A\B\lambda
1 1 jrandom SVN\stat-tests-1\A\B\F
1 1 jrandom SVN\stat-tests-1\A\B
* 1 1 jrandom SVN\stat-tests-1\A\D\G\pi
* 1 1 jrandom SVN\stat-tests-1\A\D\G\rho
* 1 1 jrandom SVN\stat-tests-1\A\D\G\tau
1 1 jrandom SVN\stat-tests-1\A\D\G
* 1 1 jrandom SVN\stat-tests-1\A\D\H\omega
1 1 jrandom SVN\stat-tests-1\A\D\H\psi
* 1 1 jrandom SVN\stat-tests-1\A\D\H\chi
1 1 jrandom SVN\stat-tests-1\A\D\H
* 1 1 jrandom SVN\stat-tests-1\A\D\gamma
1 1 jrandom SVN\stat-tests-1\A\D
* 1 1 jrandom SVN\stat-tests-1\A\mu
1 1 jrandom SVN\stat-tests-1\A\C
1 1 jrandom SVN\stat-tests-1\A
* 1 1 jrandom SVN\stat-tests-1\iota
1 1 jrandom SVN\stat-tests-1
Status against revision: 53600
real 4m58.004s
user 0m0.015s
sys 0m0.031s
The slowness is due to svn_repos_deleted_rev's linear search that starts
at the last rev in the search range and works backwards, looking for the
last time a path existed. In the above scenario this takes a long time
because a 50k rev files need to be opened. At the time I'd thought the
linear was the only way to go because a path might be deleted, re-added
and deleted many times, so a binary search wouldn't work. But as Mark
pointed out to me, we don't really care if a path of the same name was
later added then deleted, that is for the path's parent to care about in
it's ood info. Even if a path is moved within the same directory this
still holds.
Looking at the problem in this new light, I used svn_fs_node_id() and
svn_fs_compare_ids() to reimplement svn_repos_deleted_rev() as a
non-recursive binary search and committed it to the ood branch in r22139.
The change from linear to logarithmic running time has the expected
results, the 5 minute operation now takes less than a second:
$ time svn st -u -v SVN/stat-tests-1
* 1 1 jrandom SVN\stat-tests-1\A\B\E\beta
* 1 1 jrandom SVN\stat-tests-1\A\B\E\alpha
1 1 jrandom SVN\stat-tests-1\A\B\E
* 1 1 jrandom SVN\stat-tests-1\A\B\lambda
1 1 jrandom SVN\stat-tests-1\A\B\F
1 1 jrandom SVN\stat-tests-1\A\B
* 1 1 jrandom SVN\stat-tests-1\A\D\G\pi
* 1 1 jrandom SVN\stat-tests-1\A\D\G\rho
* 1 1 jrandom SVN\stat-tests-1\A\D\G\tau
1 1 jrandom SVN\stat-tests-1\A\D\G
* 1 1 jrandom SVN\stat-tests-1\A\D\H\omega
1 1 jrandom SVN\stat-tests-1\A\D\H\psi
* 1 1 jrandom SVN\stat-tests-1\A\D\H\chi
1 1 jrandom SVN\stat-tests-1\A\D\H
* 1 1 jrandom SVN\stat-tests-1\A\D\gamma
1 1 jrandom SVN\stat-tests-1\A\D
* 1 1 jrandom SVN\stat-tests-1\A\mu
1 1 jrandom SVN\stat-tests-1\A\C
1 1 jrandom SVN\stat-tests-1\A
* 1 1 jrandom SVN\stat-tests-1\iota
1 1 jrandom SVN\stat-tests-1
Status against revision: 53600
real 0m0.340s
user 0m0.015s
sys 0m0.015s
Anyone see any problems with this approach?
Paul B.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Fri Oct 27 20:57:48 2006