cvs2svn (was: Re: Initial version of svntest regression test framework is ready.)

From: Greg Stein <gstein_at_lyra.org>
Date: 2001-04-15 21:53:03 CEST

On Sun, Apr 15, 2001 at 10:08:13AM -0500, Ben Collins-Sussman wrote:
>...
> I don't think you need to be concerned about what language cvs2svn is
> written in; in fact, I don't think we need to worry at all about any
> of the things in the 'tools' directory. My understanding is that
> things in this directory are meant to be treated like extra 3rd-party
> add-ons, each an independent sub-project. (Notice that our top-level
> Makefile.am doesn't even attempt to build things in there.)
>
> A Subversion Hacker will be expected to hack on the libraries and API
> tests in C, and occasionally write a black-box test in a scripting
> language. But I would hope that cvs2svn wouldn't need regular hacking
> -- it shouldn't be part of the hacker's "core" svn repertoire.

Wherever cvs2svn happens to sit, it is part of our 1.0 release and is a
fully supported script. Converting CVS repositories is an absolute "must
have" feature. That implies some kind of testing.

I intend to work on cvs2svn at some point. Looking over what kbob did, I
think that we may need to start again. To enable us to convert really large
repositories within any reasonable time, we've got to directly parse the
files, and we've got to use the libraries (rather than the svn cmdline).
Large repositories also mean using algorithms that avoid keeping everything
in memory.
(kbob's script uses the cvs and svn cmdlines, and keeps info in memory; a
 fine design (keeps things vastly simpler), but it'll break down on large
 repositories)
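
To make the "parse the files directly, without holding everything in memory"
idea concrete, here is a minimal sketch of a streaming tokenizer for RCS ",v"
files. It is an illustration only, not the ViewCVS parser and not what
cvs2svn does: the names rcs_tokens and _bytes_from are hypothetical, the
whitespace and punctuation rules are simplified from the RCS file grammar,
and a single @-string (for example one revision's text) is still held in
memory while it is being read.

```python
import io

WHITESPACE = b" \b\t\n\v\f\r"   # simplified RCS whitespace set (assumption)
PUNCT = b";:"                    # the two punctuation tokens handled here


def _bytes_from(stream, chunk_size=8192):
    """Yield the stream one byte at a time, reading it in fixed-size chunks."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        for i in range(len(chunk)):
            yield chunk[i:i + 1]


def rcs_tokens(stream, chunk_size=8192):
    """Yield (kind, value) pairs from a binary RCS stream.

    kind is "string" for @...@ strings (with @@ unescaped to @),
    "punct" for ; and :, and "word" for everything else.
    Only the current token is buffered, never the whole file.
    """
    source = _bytes_from(stream, chunk_size)
    pending = None          # one byte of lookahead pushed back by string parsing
    word = bytearray()
    while True:
        if pending is not None:
            ch, pending = pending, None
        else:
            ch = next(source, b"")
        if ch == b"":                       # end of input
            if word:
                yield ("word", bytes(word))
            return
        if ch == b"@" or ch in PUNCT or ch in WHITESPACE:
            if word:                        # any delimiter ends the current word
                yield ("word", bytes(word))
                word = bytearray()
            if ch in PUNCT:
                yield ("punct", ch)
            elif ch == b"@":
                text = bytearray()
                while True:
                    ch = next(source, b"")
                    if ch == b"":
                        raise ValueError("unterminated @-string")
                    if ch != b"@":
                        text += ch
                        continue
                    nxt = next(source, b"")
                    if nxt == b"@":         # @@ is an escaped @
                        text += b"@"
                    else:                   # closing @; keep the byte after it
                        pending = nxt
                        break
                yield ("string", bytes(text))
        else:
            word += ch
```

A caller would open the ",v" file in binary mode and loop over
rcs_tokens(f), acting on each token as it arrives rather than building a
full in-memory picture of the repository first. Going byte by byte is slow
in Python; it is done here only to keep the chunk-boundary handling obvious.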

[ for large repositories, I'm considering the apache.org repository at 2G and
  61287 files, the RedHat "comptools" repository at 9G (?), and the
  SourceForge repository (got the numbers, but forgot to ask whether they
  are private/public; it's safe to say it is *way* larger) ]

So... I'd like to do the cvs2svn in Python using some RCS file parsing
scripts that I already have (from the ViewCVS project). Mix that with the
SWIG bindings for Python to libsvn_fs (and the upcoming _repos) library, and
we should have a converter. The guys at VA Linux have said that I could get
an account on one of their boxes to do testing against their big-ass
repository. hehe... why not go for the hardest one? :-) (seriously, though,
I'll test here at home, too)

Note: choosing another language for the script is fine, but that does imply
somebody also has to work on bindings. That means we'd need a person with
two types of experience: the SVN library API, and writing extensions for the
script language. Oh, and knowing about RCS parsing is a "kind of"
requirement (there are both Perl and Python modules to do this, so the bar
is lowered for the parsing part).

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/
Received on Sat Oct 21 14:36:28 2006

This is an archived mail posted to the Subversion Dev mailing list.