Re: cvs2svn

From: Greg Stein <gstein_at_lyra.org>
Date: 2001-04-16 22:02:08 CEST

On Mon, Apr 16, 2001 at 02:50:36PM -0400, Greg Hudson wrote:
> > At the moment, I believe the timestamp is merely a property on the
> > revision (there isn't an official timestamp now, and when asked a
> > while back, jimb said "it'll be a property"). With that in mind, we
> > ought to be able to set it arbitrarily.
>
> Just as a note, we need to keep track of the checkin timestamp
> separately from the file timestamp. For imports. It's a sad fact of
> life that software these days doesn't always build properly if you
> change the timestamps around (most commonly, it tries to rebuild
> shipped files with tools you don't have or which you have the wrong
> version of).

Woo. Good point. I never thought of that one. But then I tend to keep
generated files out of source control :-)

> > However, the primary key for that is a (hash, userid, time) tuple.
>
> (I realize this is getting kind of deep into implementation issues,
> but:) Hash the userid together with the log message.

Excellent call, sir!
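
To make that suggestion concrete, here is a minimal Python sketch of a key
that hashes the userid together with the log message, so the sort key shrinks
to a (hash, time) pair. The names `commit_key` and `sort_key`, the NUL
separator, and the choice of SHA-1 are assumptions for illustration; nothing
in this thread settles on them.

```python
import hashlib


def commit_key(userid: str, log_message: str) -> str:
    """Return a hex digest identifying one (userid, log message) pair."""
    h = hashlib.sha1()  # assumed hash; any stable digest would do here
    h.update(userid.encode("utf-8"))
    # Separator (an assumption) so ("ab", "c") and ("a", "bc") hash differently.
    h.update(b"\0")
    h.update(log_message.encode("utf-8"))
    return h.hexdigest()


def sort_key(userid: str, log_message: str, timestamp: int) -> tuple:
    """Two-field sort key: the combined hash first, then the time."""
    return (commit_key(userid, log_message), timestamp)
```

Sorting per-file revision records by `sort_key` would place revisions that
share an author and a log message next to each other, ordered by time, which
is the grouping a converter would need in order to rebuild multi-file commits.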

Yes, this is getting kind of deep, but after just speaking with Ben, he
pointed out that cvs2svn needs to be completed *much* sooner than I had
thought. If we're making M3 "self host", then we need to convert our
repository. So... implementation issues are near enough to be up for
discussion :-)

Another corollary is that the SWIG bindings for libsvn_fs need to be
completed sooner rather than later (because there is *no* way that I'll
write that converter in C :-). We can skip most of the bindings for the
other libraries, and we only need to prep/verify one language binding to
libsvn_fs (rather than all six which are currently targeted).

> > We can do a preliminary bin-sort on the hash
>
> This might be second-guessing gsort too much; not sure.

Absolutely. The first version will just produce a monster log. If gsort can't
handle it, then I'll do the bins. Mostly, that was me thinking aloud: "Wow,
do we have a solution in case gsort takes five days to sort that file?" My
thought experiment leads me to answer, "yes." The best that gsort can do is
O(N log N), but we can split the log into M bins so that each sort sees only
N/M records, and the total becomes M * ((N/M) log (N/M)), i.e. N log (N/M).
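
The bin idea can be sketched in a few lines of Python. This is only an
illustration under assumptions: records are taken to be tuples whose first
field is the hash string, bins are chosen by a prefix of that hash, and
everything is held in memory, whereas a real converter would presumably write
each bin to its own file and sort the files separately. The name `bin_sort`
and the `prefix_len` parameter are hypothetical.

```python
from collections import defaultdict


def bin_sort(records, prefix_len=1):
    """Sort records by partitioning on a hash prefix, then sorting each bin.

    Each record is assumed to be a tuple whose first element is the hash
    string, i.e. the leading field of the sort key.
    """
    bins = defaultdict(list)
    for rec in records:
        # Records with the same leading hash characters land in the same bin.
        bins[rec[0][:prefix_len]].append(rec)

    result = []
    # Visiting bins in prefix order keeps the concatenation globally sorted,
    # because the prefix is the most significant part of the key.
    for prefix in sorted(bins):
        result.extend(sorted(bins[prefix]))
    return result
```

Because the bins partition on the leading part of the primary key, the
concatenated output matches what a single sort of the whole log would give,
while each individual sort only sees its own bin.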

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/
Received on Sat Oct 21 14:36:28 2006

This is an archived mail posted to the Subversion Dev mailing list.