Re: incremental conversion from other SCM to svn by vcp

From: <kfogel_at_collab.net>
Date: 2003-06-13 20:08:14 CEST

Chia-liang Kao <clkao@clkao.org> writes:
> actually the vcp core does only per-file branch point deduction. but
> the revision in svn semantics (svn cp trunk -r <which>) for branching
> is decided by the create_branch function in VCP::Dest::svn i wrote.

Oh! I misunderstood what you were saying in IRC, then; I thought you
were relying on a core function of VCP.

> > - But it doesn't handle tags, because VCP doesn't deduce the tag
> > points in the same way. Instead, it just marks the tags per file
> > revision... which doesn't help us much in Subversion.
>
> it shouldn't be hard to implement that, since the deduction is pretty much
> like what I did for the branching point: decide the `global point' from the
> points of every file.

Yes, that makes sense.
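
One way to picture the "global point" deduction is as an intersection of
per-file windows. The sketch below is only an illustration under stated
assumptions, not the logic in VCP::Dest::svn: it assumes each file
contributes a hypothetical (first_rev, next_rev) pair, meaning the tagged
content is in place from repository revision first_rev until it is
replaced at next_rev (None if it never changes again). The function name
and the input shape are invented for this example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical input shape: path -> (first_rev, next_rev).
# first_rev: repository revision where the tagged file content first appears.
# next_rev:  repository revision where that content is replaced, or None.
Windows = Dict[str, Tuple[int, Optional[int]]]


def deduce_global_point(windows: Windows) -> Optional[int]:
    """Return the earliest revision at which every file has its tagged
    content, or None if no single revision satisfies all files."""
    if not windows:
        return None
    # The point cannot be earlier than the latest "first appearance".
    lo = max(first for first, _ in windows.values())
    # It must also come before the earliest replacement, if there is one.
    ends = [nxt for _, nxt in windows.values() if nxt is not None]
    if ends and lo >= min(ends):
        return None  # windows do not overlap: a "split" tag or branch
    return lo


point = deduce_global_point({"a.c": (3, 10), "b.c": (5, None), "c.c": (4, 8)})
split = deduce_global_point({"a.c": (3, 5), "b.c": (5, None)})
```

When the windows do not overlap, the function returns None. That is the
case where a single "svn cp trunk -r <which>" cannot reproduce the tag and
some per-file fix-up would be needed.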

Some questions:

   - Have you tested the driver on any really big repositories, like
     the FreeBSD CVS repository (2.3 gigs)? Also, that one's good
     because it has a lot of edge cases -- twice-deleted files,
     branches where some files are branched much later than others
     ("split" branches), etc.

   - Is it holding a lot of state in memory, such as all the branch
     paths and things like that?

> as i saw in the profiling from vcp log, svn commit takes some time, the
> longest is 20 sec or so for one large commit. but the bottleneck right
> now is how it extracts every revision from cvs: doing cvs checkout -r
> <revision> <onefile> for every file. i'll be implementing fast retrieval
> of cvs by setting date tag and verifying the resulting revision, hopefully
> this would speed up the conversion. but more important is that the
> conversion is incremental, so even if the very first conversion of a
> large repository is slow, subsequent conversions of newly committed
> files won't take long.

Well, the total conversion time is still important -- many sites will
be converting once and then using just Subversion. For them, the main
issue is "How long will my developers be shut out of the repository
during this conversion?"

If 2000 revisions took 7 hours, then (say) the main GNU toolchain
repository would probably need a conversion time of several days.
Maybe not a showstopper, but kind of inconvenient. They'd probably
have to copy the repository, then convert the copy, then port over all
changes that came into the master during the conversion... Or
something like that.
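
The estimate above is a straight linear extrapolation from the reported
timing (2000 revisions in 7 hours, about 12.6 seconds per revision). A
minimal sketch of that arithmetic follows; the revision count of 30000 is
a made-up placeholder, not a measured figure for any real repository, and
the function name is invented for this example.

```python
def estimate_conversion_hours(revisions: int,
                              sample_revisions: int = 2000,
                              sample_hours: float = 7.0) -> float:
    """Scale the observed sample timing linearly to another revision count."""
    return revisions * sample_hours / sample_revisions


# Per-revision cost implied by the reported timing.
seconds_per_revision = 7.0 * 3600 / 2000

# Hypothetical repository size, chosen only to illustrate the scale.
hypothetical_revisions = 30000
hours = estimate_conversion_hours(hypothetical_revisions)
days = hours / 24
```

Linear scaling is itself an assumption: if per-revision cost grows with
repository size, the real figure would be worse.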

(I'm not sure how using date instead of revision will help your CVS
retrieval time? I would think the Subversion commits are a huge
bottleneck... outputting to a dumpfile and then loading it might save
a lot of time.)

> > Because of the tag problem, and the timings, I'm continuing with
> > cvs2svn.py (which I judge to be "near" completion, famous last words).
> > But I hope you're planning to continue with the VCP work -- it's an
> > important gateway to other SCM systems. And if it eventually makes
> > cvs2svn obsolete, no one will be happier than me! :-)
>
> yes I do. basically it's motivated by scratch-ones-own-itches as I
> mentioned on irc. :) another advantage of vcp is that there are source
> drivers for other scm, specifically p4, which the perl folks are
> currently using, and it's said there's a consensus to switch to svn.

Yah!

Good luck with the tags; let me know if I can help (by supplying test
data or whatever).

-Karl

Received on Fri Jun 13 20:54:39 2003

This is an archived mail posted to the Subversion Dev mailing list.