Re: Antwort: Re: cvs2svn.py fails converting repository with the attached file

From: mark benedetto king <mbk_at_lowlatency.com>
Date: 2003-06-27 19:19:26 CEST

On Fri, Jun 27, 2003 at 11:02:44AM -0500, kfogel@collab.net wrote:
> mark benedetto king <mbk@lowlatency.com> writes:
> > I had to do some rcs file mangling recently, with thousands of revisions
> > of very large files. Using co directly was unacceptable because of the
> > O(N^2) nature.
> >
> > Instead, I used a slightly modified rcsparse to extract not only the change
> > metadata, but the deltas themselves, and the fulltext of the HEAD.
> >
> > I took HEAD and picked up the deltas in reverse order, reconstructing
> > all of the fulltexts in N passes (there were no branches in these
> > rcs files).
> >
> > This gave me a tremendous speedup, but wouldn't it also allow us to
> > remove the requirement for "co"?
> Yeah -- Greg (Stein) and I were recently talking about doing just
> this, in fact. I assume this technique caused massive disk usage,
> since you had to keep all those fulltexts around in order to avoid the
> N^2 behavior?

Yes, that's true. Lucky for me, I was able to deal with one ,v at a time.
Since cvs2svn wants all fulltexts for all ,v files for a particular txn
(I presume), that would be quite a bit of data: essentially every fulltext
of every file all at once.

Aha! We could *invert* the delta as we go. I.e., start with HEAD, work
back to rev-1, emitting forward deltas along the way. Neat, though
it does trade CPU and I/O for the space savings.

Attached is my quick-hack-in-Perl RCS delta applier. It may not be
completely correct, but it seems to have worked for all of my data.

I'm sure it would port pretty quickly to Python.
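
For illustration, here is a rough Python sketch of the invert-as-we-go
idea. It is not a port of the attached Perl script. The names
apply_and_invert and walk_back are made up for this example, and it
assumes the parser has already split each text into a list of lines
(without newlines) and undone the "@@" quoting. It also assumes the
usual RCS edit-script commands: "dN M" deletes M lines starting at
line N, and "aN M" adds the following M lines after line N, with N
counted in the text the delta is applied to. Trailing-newline handling
is ignored entirely.

```python
def apply_and_invert(src, delta):
    """Apply an RCS-style delta to src (a list of lines).

    Returns (dst, inverse), where applying inverse to dst gives src back.
    """
    dst, inverse = [], []
    pos = 0  # 0-based index of the next src line not yet consumed
    i = 0
    while i < len(delta):
        op = delta[i][0]
        start, count = (int(n) for n in delta[i][1:].split())
        i += 1
        if op == "d":
            # Copy the untouched lines before the deleted range.
            dst.extend(src[pos:start - 1])
            removed = src[start - 1:start - 1 + count]
            # Inverse: put the removed lines back after the current dst line.
            inverse.append("a%d %d" % (len(dst), count))
            inverse.extend(removed)
            pos = start - 1 + count
        elif op == "a":
            # Copy the untouched lines up to and including line `start`.
            dst.extend(src[pos:start])
            pos = max(pos, start)
            # Inverse: delete the lines we are about to add.
            inverse.append("d%d %d" % (len(dst) + 1, count))
            dst.extend(delta[i:i + count])
            i += count
        else:
            raise ValueError("unknown delta command: %r" % delta[i - 1])
    dst.extend(src[pos:])
    return dst, inverse


def walk_back(head, reverse_deltas):
    """Walk from HEAD toward the oldest revision (trunk only, no branches).

    reverse_deltas must be ordered newest first. Yields (fulltext,
    forward_delta) pairs; only one fulltext is held at a time.
    """
    text = head
    for delta in reverse_deltas:
        older, forward = apply_and_invert(text, delta)
        yield older, forward
        text = older
```

The inverse produced this way round-trips through the same function,
but its command order may differ from what "diff -n" would emit, so
I would not assume RCS itself accepts it without checking.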



Received on Fri Jun 27 19:19:36 2003

This is an archived mail posted to the Subversion Dev mailing list.
