Re: Should cvs2svn try to collect changesets at all?

From: Karl Fogel <kfogel_at_collab.net>
Date: 2001-04-17 19:49:17 CEST

Jason Molenda <jason-svn@molenda.com> writes:
> I think this will be a tricky problem to solve. Is it worth solving
> from the start? If the goal of collating individual checkins into
> changesets is ignored, you get a subversion repository with lots
> of one-file changes after the cvs2svn conversion, but it'll be a
> lot easier to write cvs2svn. That doesn't seem so bad.
>
> Ignoring all the resource issues, the first step is to generate a
> list of 'changesets', or logically grouped check-ins, to the cvs
> repo. You end up with a list like
>
> 2001-04-10T08:15:10 bob ChangeLog,1.832 foo.c,1.3 bar.h,1.18
> 2001-04-10T08:35:25 susan ChangeLog,1.833 baz.c,1.5 bar.h,1.19
>
> etc.
>
> The scenario I don't think is solvable -
>
> user bob starts his checkin at time T, across four directories,
> A, B, C, and D. He does each directory's checkin separately, but
> with the same commit message and close in time, so it is reasonable
> to expect that cvs2svn could collect all of these into a single
> atomic commit.
>
> At T+30sec, user susan checks in a file in directory C.
>
> At T+45sec, user bob gets to C, finds that his directory is not up to date,
> updates (and maybe resolves some conflicts) and does his commit.
>
> I don't see how you can handle this scenario. Even detecting it
> would be difficult. The changeset list would look like
>
> 2001-04-10T08:15:10 bob A/f,1.3 B/e,1.8 C/d,1.4 D/c,1.18
> 2001-04-10T08:35:25 susan C/d,1.3
>
> cvs2svn has to notice that the T+30 checkin has an earlier revision
> of a file than the T checkin, and break apart the T checkin into
> two separate checkins, or maybe reschedule it to T+31.

It's quite detectable. If cvs2svn's "call-it-a-commit window" is
greater than 45 seconds wide, then Bob's change to C is still part of
his commit. It doesn't matter that Susan got one in before him.
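
To make the window idea concrete, here is a rough sketch in Python. It is only an illustration, not how cvs2svn or cvs2cl actually does it: the `Checkin` and `Changeset` types, the `group_checkins` function, the 60-second default, and the log messages are all made up for the example. It assumes a changeset is a run of check-ins with the same author and log message, all falling within the window measured from the first one.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Checkin:
    """One per-file CVS check-in (hypothetical record shape)."""
    time: datetime
    author: str
    log: str
    path: str
    rev: str


@dataclass
class Changeset:
    """A group of check-ins treated as one commit."""
    author: str
    log: str
    checkins: list = field(default_factory=list)


def group_checkins(checkins, window=timedelta(seconds=60)):
    """Group check-ins by (author, log) within `window` of the group's first check-in."""
    open_sets = {}  # (author, log) -> changeset currently being extended
    result = []
    for ci in sorted(checkins, key=lambda c: c.time):
        key = (ci.author, ci.log)
        cs = open_sets.get(key)
        # Start a new changeset if none is open or the window has closed.
        if cs is None or ci.time - cs.checkins[0].time > window:
            cs = Changeset(ci.author, ci.log)
            open_sets[key] = cs
            result.append(cs)
        cs.checkins.append(ci)
    return result


# The Bob/Susan scenario; the log messages are invented.
T = datetime(2001, 4, 10, 8, 15, 10)
msg = "fix widget"
checkins = [
    Checkin(T, "bob", msg, "A/f", "1.3"),
    Checkin(T + timedelta(seconds=5), "bob", msg, "B/e", "1.8"),
    Checkin(T + timedelta(seconds=30), "susan", "tweak d", "C/d", "1.3"),
    Checkin(T + timedelta(seconds=45), "bob", msg, "C/d", "1.4"),
    Checkin(T + timedelta(seconds=50), "bob", msg, "D/c", "1.18"),
]
changesets = group_checkins(checkins)
for cs in changesets:
    print(cs.author, [f"{c.path},{c.rev}" for c in cs.checkins])
```

With a 60-second window, Bob's four files land in one changeset and Susan's single file in another, even though her check-in falls in the middle of his. Shrink the window below 45 seconds and Bob's commit splits in two at C.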

It's not horrible if cvs2svn can't always order two commits that
happened simultaneously and affected some of the same things. It will
just have to arbitrarily put one before the other. Since CVS itself
doesn't force an ordering, this is the best we can do with the
information available. Not a big deal, though.
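
One way to keep that arbitrary choice at least repeatable is to sort on time first and fall back to some fixed tie-breaker. The sketch below assumes a hypothetical `Changeset` tuple and uses author and log message as the tie-breaker; any stable key would do just as well.

```python
from datetime import datetime
from typing import NamedTuple


class Changeset(NamedTuple):
    """Hypothetical summary of one grouped commit."""
    start: datetime
    author: str
    log: str


def commit_order(changesets):
    """Order by start time; author and log only break ties, arbitrarily but deterministically."""
    return sorted(changesets, key=lambda cs: (cs.start, cs.author, cs.log))


t = datetime(2001, 4, 10, 8, 15, 10)
simultaneous = [
    Changeset(t, "susan", "tweak d"),
    Changeset(t, "bob", "fix widget"),
]
ordered = commit_order(simultaneous)
```

The point is only that two runs of the converter over the same repository would pick the same order, not that the order chosen is the "real" one.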

> In an attempt to collect individual cvs checkins into groupings,
> we paper over the fact that cvs doesn't have atomic checkins, and
> eventually cvs2svn will lose because of that.

I don't see the lossage here.

(The cvs2cl script already uses these heuristics to identify atomic CVS
checkins. If it were more efficient, in fact, it could be used
to build cvs2svn's metadata table; but I wouldn't want to run it on a
49 gig repository. :-) )
Received on Sat Oct 21 14:36:28 2006
