
Re: incremental conversion from other SCM to svn by vcp

From: Barrie Slaymaker <barries_at_slaysys.com>
Date: 2003-06-17 15:21:58 CEST

[Sorry for the delayed replies, was out of town & offline, thankfully :]

On Sat, Jun 14, 2003 at 01:40:29AM +0800, Chia-liang Kao wrote:
> On Fri, Jun 13, 2003 at 11:21:35AM -0500, kfogel@collab.net wrote:
> > - The subversion dest driver for VCP handles branches pretty well,
> > because VCP itself already does the branch-point finding. So
> > branches are created as copies, with the appropriate files added
> > or removed when necessary. The revision history of your sample
> > repository at http://svn.openfoundry.org/svn/sympa/ shows this.
>
> actually the vcp core does only per-file branch point deduction. but
> the revision in svn semantic (svn cp trunk -r <which>) for branching
> is decided by the create_branch function in VCP::Dest::svn i wrote.

VCP::Source::cvs does detect branch creation and marks it with
"placeholder" revisions (no delta; the rev id is the magic branch
number, like "1.1.2.0", etc.). This is so that CVS branches that do not
contain any changed files still result in a branch in the destination
repository and so that, in systems like Perforce, all the files in a
new branch can be branched in a single operation. The changeset
aggregator should put all the branch-founding placeholders in a single
changeset so that VCP::Dest::foo can do the branch as a single operation.
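
To make the grouping idea concrete, here is a rough sketch in Python (not
VCP's actual Perl code). The Rev record, its field names, and
group_branch_placeholders() are all made up for illustration; the only
point is that placeholder revs for the same branch end up together so a
destination driver can create the branch in one step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Rev:
    # Hypothetical record; these are not VCP's real field names.
    path: str
    rev_id: str
    branch: str
    placeholder: bool = False  # True for a no-delta, branch-founding rev


def group_branch_placeholders(revs):
    """Collect placeholder revs per branch, preserving input order."""
    by_branch = defaultdict(list)
    for rev in revs:
        if rev.placeholder:
            by_branch[rev.branch].append(rev)
    return dict(by_branch)


revs = [
    Rev("a.c", "1.1.2.0", "rel-1", placeholder=True),
    Rev("b.c", "1.3.2.0", "rel-1", placeholder=True),
    Rev("a.c", "1.2", "trunk"),  # ordinary revision, not a placeholder
]
changesets = group_branch_placeholders(revs)
```

Each value in changesets is then one candidate "create branch" changeset;
a real aggregator would presumably also consider time and user.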

> > - But it doesn't handle tags, because VCP doesn't deduce the tag
> > points in the same way. Instead, it just marks the tags per file
> > revision... which doesn't help us much in Subversion.
>
> it shouldn't be hard to implement that. since the deduction is pretty much
> like I did in branching point: decide the `global point' from the points of
> every file.

Can somebody point me to a deep description of what svn means by a tag?

> > - The conversion time seems a bit slow to me (7 hours for 2000 svn
> > revs with four branches). Extrapolating from cvs2svn.py's
> > performance right now, I think it would do that in 10 minutes at
> > the most. But perhaps there are optimizations you are planning?
>
> as i saw in the profiling from vcp log, svn commit takes some time, the
> longest is 20 sec or so for one large commit. but the bottleneck right
> now is how it extracts every revision from cvs: doing cvs checkout -r
> <revision> <onefile> for every file. i'll be implementing fast retrieval
> of cvs by setting date tag and verifying the resulting revision, hopefully
> this would boost conversion time. but more importantly is that the conversion
> is incremental, so even if the very first conversion of a large repository is
> slow, subsequent conversion of newly committed files won't take long.

Try also reading the source files directly. I'd like to take the RCS
file parser and have it cache (on disk) reversed deltas from the head
back to the oldest revision retrieved, then apply these reversed deltas
as it "cvs checkout"s each new revision. This will prevent it from
spawning cvs each time (ugh), and will make it more efficient because it
can apply the patches in a going-forward direction.

VCP::Source::revml does the roll-forward-and-patch operation already,
using VCP::Patch (a limited, all-Perl, and thus slower but \000-safe,
patch routine), so the remaining operation here is to reverse any
previously unreversed patches as a checkout is simulated and
store them on disk.
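
A minimal sketch of the reverse-then-roll-forward idea, in Python rather
than Perl. It is neither VCP::Patch nor the RCS delta format: a delta here
is assumed to be a sorted list of (start, end, new_lines) hunks against the
source text, and apply_and_invert() is a hypothetical helper. Walking back
from the head applies each stored delta and records its inverse; those
inverses are then replayed oldest-first to simulate the checkouts.

```python
def apply_and_invert(src, delta):
    """Apply hunks to src; return (result, inverse delta against result)."""
    dst, inverse = [], []
    pos = 0
    for start, end, new_lines in delta:
        dst.extend(src[pos:start])  # copy unchanged lines before the hunk
        # The inverse hunk replaces the new lines with the ones removed.
        inverse.append((len(dst), len(dst) + len(new_lines), src[start:end]))
        dst.extend(new_lines)
        pos = end
    dst.extend(src[pos:])  # copy the unchanged tail
    return dst, inverse


head = ["one", "two", "three"]
# Deltas as stored, pointing from newer to older revisions.
stored_deltas = [
    [(2, 3, [])],        # head -> previous rev: drop "three"
    [(0, 1, ["uno"])],   # previous rev -> oldest: "one" was "uno"
]

# Walk back to the oldest revision, caching forward-going deltas.
text, forward = head, []
for delta in stored_deltas:
    text, inverted = apply_and_invert(text, delta)
    forward.append(inverted)
oldest = text

# Roll forward, producing each newer revision in turn.
rolled = []
for delta in reversed(forward):
    text, _ = apply_and_invert(text, delta)
    rolled.append(text)
```

In the real thing the forward list would live on disk, and only deltas not
yet reversed would need to be processed on an incremental run.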

This should be lots faster than spawning CVS kids, and hopefully even
faster and less error-prone than using the -d$DATE option. But it would
only apply to CVSROOTs on the local fs, so the -d$DATE optimization you
discuss would be very nice for the :pserver: and :ext: variants.

- Barrie

Received on Tue Jun 17 15:34:05 2003

This is an archived mail posted to the Subversion Dev mailing list.
