
Re: cvs2svn

From: Greg Stein <gstein_at_lyra.org>
Date: 2001-04-17 00:56:25 CEST

On Mon, Apr 16, 2001 at 02:58:11PM -0700, Bob Miller wrote:
> Sort by time as primary key. You want to build the SVN repository in
> chronological order, anyway. As you traverse the sequence of CVS
> commits in chrono order, group those which match the grouping
> heuristic into a single SVN commit.

Hmm. I was thinking "sorting by hash (with time as a secondary sort key)
groups the commits automatically, and avoids interleaved groups." But if I
do that, then the groups come out in no particular chronological order.

Changing to sort by time first solves that, at the expense of tracking N
groups at a time.

Oh wait. If an interleave of two sets occurs, then you don't know when you
have the complete group. You need some kind of lookahead to decide.


Well, some heuristics can deal with the lookahead. Stuff like:

* if the duration from CURRENT to the previous file is larger than D1, then
  close all open sets

* if the duration from CURRENT to a set's latest file is larger than D2,
  then close that set.

* remember the last N sets after closing them. if a later file matches one
  of them (optional: within time T), then issue a warning. (optional: drop
  remembered sets that were closed more than T ago)
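
A rough Python sketch of the first two rules, for illustration only. The
Rev record, its field names, and the default values for d1 and d2 are
assumptions made up for this example, not anything cvs2svn defines. The
"key" field stands in for the hash of whatever the grouping heuristic
matches on (say, author plus log message). The third rule (remembering
closed sets and warning on a late hit) is left out.

```python
from collections import namedtuple

# Hypothetical record for one CVS file revision; the field names are assumptions.
Rev = namedtuple("Rev", "time key path")

def group_commits(revs, d1=60, d2=300):
    """Group revisions into commit sets, walking them in time order.

    d1: gap in seconds since the previous revision that closes all open sets.
    d2: gap in seconds since a set's latest revision that closes that set.
    Both defaults are placeholders, not measured values.
    """
    open_sets = {}    # key -> revisions of the still-open set for that key
    closed = []       # finished sets, in the order they were closed
    prev_time = None
    for rev in sorted(revs, key=lambda r: r.time):
        # Rule 1: a long quiet period closes every open set.
        if prev_time is not None and rev.time - prev_time > d1:
            closed.extend(open_sets.values())
            open_sets.clear()
        # Rule 2: close any single set whose latest file is too old.
        stale = [k for k, s in open_sets.items() if rev.time - s[-1].time > d2]
        for k in stale:
            closed.append(open_sets.pop(k))
        open_sets.setdefault(rev.key, []).append(rev)
        prev_time = rev.time
    # End of input: whatever is still open is complete.
    closed.extend(open_sets.values())
    return closed
```

Note that rule 1 only adds something when d1 is smaller than d2. With d1
at or above d2, any gap that trips rule 1 would have tripped rule 2 for
every open set anyway.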

Note that we can easily generate statistics from various repositories (min,
max, avg, needed N, D, T, etc) and code those up as defaults. Default to an
average, warn up to the observed max. Ignore "sets" that span more than the
max observed times.
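
One way the statistics idea might look in Python. This assumes we already
have some commits whose grouping we trust (for example, hand-checked ones),
each given as a list of file timestamps in seconds. The function names and
the accept/warn/reject labels are invented for this sketch.

```python
from statistics import mean

def gap_stats(commit_sets):
    """Collect min/max/avg of the gaps between consecutive files in known commits.

    commit_sets: iterable of lists of timestamps (seconds), one list per commit.
    """
    gaps = []
    for times in commit_sets:
        ordered = sorted(times)
        gaps.extend(b - a for a, b in zip(ordered, ordered[1:]))
    if not gaps:
        raise ValueError("need at least one commit with two or more files")
    return {"min": min(gaps), "max": max(gaps), "avg": mean(gaps)}

def classify_gap(gap, stats):
    """Default to the average, warn up to the observed max, reject beyond it."""
    if gap <= stats["avg"]:
        return "accept"
    if gap <= stats["max"]:
        return "warn"
    return "reject"
```

The same shape would work for whole-set durations instead of per-file gaps,
if that turns out to be the more useful number.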

Analysis of the intervals between file revisions can also be important. If
Jane commits 10 files in the space of a second, then another 10 (with the
same message) one minute later, then we probably ought to view them as
separate commits. That "one minute" shouldn't be considered variance. IOW,
compute a files/sec rate for each commit, and then toss any file whose
timestamp falls outside the range that rate predicts.
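
A small Python sketch of that rate check, under assumptions of my own: the
input is the list of timestamps (in seconds) for revisions that already
match on the grouping heuristic, and "outside of an expected range" is taken
to mean a gap more than `slack` times the average interval seen so far.
The `slack` and `min_interval` parameters are hypothetical knobs, and
`min_interval` exists only because one-second timestamps make the observed
interval zero for fast commits.

```python
def split_by_rate(times, slack=5.0, min_interval=1.0):
    """Split matching revisions into bursts wherever the pace clearly breaks.

    times: timestamps in seconds of revisions with the same grouping key.
    slack: how many expected intervals a gap may span before we split.
    min_interval: floor on the expected interval, in seconds.
    """
    bursts = []
    current = []
    for t in sorted(times):
        if len(current) >= 2:
            span = current[-1] - current[0]
            # Average seconds per file so far (the inverse of files/sec).
            expected = max(span / (len(current) - 1), min_interval)
            if t - current[-1] > slack * expected:
                bursts.append(current)
                current = []
        current.append(t)
    if current:
        bursts.append(current)
    return bursts
```

With Jane's example (10 files within a second, then 10 more a minute later)
this yields two bursts of 10. One known weakness: no rate exists until a
burst has two files, so the second file always joins the first regardless
of the gap.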


Greg Stein, http://www.lyra.org/
Received on Sat Oct 21 14:36:28 2006

This is an archived mail posted to the Subversion Dev mailing list.