Proposal to speed up rev-hunting

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2004-09-17 18:20:40 CEST

clkao would like rev-hunting to be faster, because svk has an option
to improve merging which makes intensive use of
svn_repos_trace_node_locations(). And, of course, we could benefit
ourselves from faster rev-hunting for operations which take a peg rev.

We hypothesize that the problem is having to walk the predecessor
chain of the file, which may be rather long, when all we really care
about is the copies.

It turns out that both back ends have enough schema information to
find the last copy applicable to a node-rev without walking the
predecessor chain. So, here is a proposal to do what clkao wants:

  1. Introduce a new API svn_fs_closest_copy:
     Inputs: path, rev
     Outputs: copy_path, copy_rev, remainder

     Returns the destination of the most recent copy to affect
     path@rev. The copy may be of a parent directory; remainder gives
     the portion of path relative to the copy destination. (Maybe
     this API returns the copy source, or maybe you use
     svn_fs_copied_from to get it. Not terribly important.)

     FSFS implementation: Call find_youngest_copyroot(), which is used
     by the history code. Double-check that path@copy_rev exists and
     is related to path@rev, to eliminate the possibility that
     path@rev was created from scratch at some rev between copy_rev
     and rev.

     BDB implementation: The logic used by find_youngest_copyroot()
     can be replicated under BDB (the FSFS "copy root" of a node-rev
     can be obtained in BDB by looking up the node-rev's copy ID in
     the copies table), so it's definitely possible to do the same
     thing under BDB as I proposed for FSFS. There might be a faster
     way based on what the BDB history code does; I don't fully
     understand that code at the moment.
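
     As a rough illustration of the intended semantics only (not the
     FSFS or BDB implementation), here is a toy Python model. The
     names Copy, ClosestCopy, closest_copy and scratch_adds are
     hypothetical; the copy records and from-scratch adds are assumed
     to be available as plain in-memory lists, which is not how either
     back end stores them.

```python
from typing import List, NamedTuple, Optional, Tuple


class Copy(NamedTuple):
    """One copy operation: src_path@src_rev was copied to dst_path in dst_rev."""
    dst_path: str
    dst_rev: int
    src_path: str
    src_rev: int


class ClosestCopy(NamedTuple):
    """Toy stand-in for the proposed outputs: copy_path, copy_rev, remainder."""
    copy_path: str
    copy_rev: int
    remainder: str


def is_at_or_under(path: str, parent: str) -> bool:
    """True if path equals parent or lies somewhere beneath it."""
    return path == parent or path.startswith(parent.rstrip("/") + "/")


def closest_copy(copies: List[Copy],
                 scratch_adds: List[Tuple[str, int]],
                 path: str, rev: int) -> Optional[ClosestCopy]:
    """Return the destination of the most recent copy affecting path@rev.

    scratch_adds lists (path, rev) pairs where a path was created from
    scratch rather than copied (an assumption of this toy model).
    """
    candidates = [c for c in copies
                  if c.dst_rev <= rev and is_at_or_under(path, c.dst_path)]
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.dst_rev)

    # The double-check from the proposal: if path (or a parent of it) was
    # created from scratch after the copy, path@rev is unrelated to it.
    for add_path, add_rev in scratch_adds:
        if best.dst_rev < add_rev <= rev and is_at_or_under(path, add_path):
            return None

    remainder = path[len(best.dst_path):].lstrip("/")
    return ClosestCopy(best.dst_path, best.dst_rev, remainder)


COPIES = [
    Copy("/branches/b", 5, "/trunk", 4),
    Copy("/tags/t", 8, "/branches/b", 7),
]
```

     With these records, closest_copy(COPIES, [], "/tags/t/src/main.c", 10)
     yields copy_path "/tags/t", copy_rev 8 and remainder "src/main.c".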

  2. Modify svn_repos_trace_node_locations() to use the new API.

     If we run out of copies before we hit one or more of the desired
     location revs, we look up last-copy-source-path@location-rev and
     check to see if it is related to path@rev, to eliminate the
     possibility that last-copy-source-path was created from scratch
     at some rev between location-rev and rev.
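
     The loop this implies might look roughly like the Python sketch
     below. It is an illustration under assumptions, not the real
     svn_repos_trace_node_locations(): trace_locations, is_related and
     demo_closest_copy are hypothetical names, the closest-copy
     callback is assumed to return the copy source as well, only
     location revs at or before the peg rev are handled, and revs that
     fall strictly between a copy's source rev and its destination rev
     are simply skipped.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# (copy_path, copy_rev, remainder, src_path, src_rev)
CopyInfo = Tuple[str, int, str, str, int]


def trace_locations(path: str, peg_rev: int, location_revs: Iterable[int],
                    closest_copy: Callable[[str, int], Optional[CopyInfo]],
                    is_related: Callable[[str, int], bool] = lambda p, r: True
                    ) -> Dict[int, str]:
    """Map each requested rev to the path the node had there, hopping
    from copy to copy instead of walking every predecessor."""
    result: Dict[int, str] = {}
    # Youngest first; revs after the peg rev are out of scope here.
    pending = sorted({r for r in location_revs if r <= peg_rev}, reverse=True)
    cur_path, cur_rev = path, peg_rev

    while pending:
        info = closest_copy(cur_path, cur_rev)
        if info is None:
            # Out of copies: the path no longer changes, but each remaining
            # rev still needs the relatedness check from the proposal.
            for r in pending:
                if is_related(cur_path, r):
                    result[r] = cur_path
            break
        copy_path, copy_rev, remainder, src_path, src_rev = info
        # Revs at or after the copy see the node at its current path.
        while pending and pending[0] >= copy_rev:
            result[pending.pop(0)] = cur_path
        # Revs strictly between source rev and copy rev are skipped.
        while pending and pending[0] > src_rev:
            pending.pop(0)
        # Continue the hunt from the copy source.
        cur_path = src_path + "/" + remainder if remainder else src_path
        cur_rev = src_rev
    return result


def demo_closest_copy(path: str, rev: int) -> Optional[CopyInfo]:
    """Hard-coded stand-in for the proposed API, youngest copy first."""
    table = [("/tags/t", 8, "/branches/b", 6), ("/branches/b", 5, "/trunk", 4)]
    for copy_path, copy_rev, src_path, src_rev in table:
        if copy_rev <= rev and (path == copy_path
                                or path.startswith(copy_path + "/")):
            remainder = path[len(copy_path):].lstrip("/")
            return (copy_path, copy_rev, remainder, src_path, src_rev)
    return None
```

     For example, trace_locations("/tags/t/main.c", 10, [9, 7, 6, 3],
     demo_closest_copy) maps rev 9 to /tags/t/main.c, rev 6 to
     /branches/b/main.c and rev 3 to /trunk/main.c; rev 7 is omitted
     because it lies between the tag copy's source and destination.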

If you're dying to become a minor FS guru (or if you're already an FS
guru and have some extra time), this might be a neat project.
Otherwise, I may get to it at some point.

Received on Fri Sep 17 18:21:01 2004

This is an archived mail posted to the Subversion Dev mailing list.
