
Re: Large repository with a long checkout problem

From: Nico Kadel-Garcia <nkadel_at_comcast.net>
Date: 2006-08-07 17:18:14 CEST

Troy Curtis Jr wrote:
> On 8/5/06, Nico Kadel-Garcia <nkadel@comcast.net> wrote:
>> Troy Curtis Jr wrote:
>>> Hello all,
>>>
>>> I am attempting to convert the RCS repository that we use at work
>>> over to subversion. I was able to successfully use cvs2svn (with a
>>> couple of tweaks) and got all of our changes into the shiny new
>>> subversion repo. The first issue is that it ballooned a ~1GB RCS
>>> repository to ~2GB, but we can deal with that...especially since it
>>> took us 12 years or so to get the first 1GB. Also, the HEAD
>>> revision is 60202, so there are a lot of commits.
>>>
>>> The main problem is checkout times, which are sitting at 5.5 minutes
>>> over LAN. I can get the same code down to 1.5 minutes (which
>>> matches our current method) if I export the HEAD revision and
>>> import it into a brand-new repo. Now, I am using FSFS which I know
>>> has longer checkouts because it must go through all the diffs back
>>> to the original version of each file. I have looked at Berkeley DB,
>>> but I just know that some of the more "seasoned" (read: resistant
>>> to change) will get very upset the first time they have to go do a
>>> "svnadmin recover", or worse yet, wait for me to do one.
>>>
>>> All that background to ask this question: Is there a way to tell
>>> subversion to build a complete revision at some specific point and
>>> then diff from there on? (Note that I only care about doing this to
>>> the trunk; the branches can stay where they are.) This way the
>>> svnserver will only have to go back a relatively few revisions. But we
>>> have to have the past revisions, so I cannot just start a
>>> new repository at some arbitrary point. Specifically we need the
>>> log messages of all those revisions. Exporting the HEAD, then
>>> deleting all the files, and then importing the exported files would
>>> get the fresh copy of the code set into a specific rev, but then I
>>> would not have the logs and previous diffs for all the prior
>>> revisions.
>>
>> I think you can do an "export", import that to a new repository,
>> then get a differential dump of the changes to apply on top.
>>
>
> I am not exactly sure what you mean in this paragraph, but it
> sounds kinda promising. Could I ask you to elaborate a little?
>
>
>> Also, eliminating "empty" changes may help significantly reduce the
>> number of changes, but do *not* try that stunt without starting a new
>> URL to the repository to force a successful "svn switch".
>>
>
> Is there an easy (read: scriptable) way to do this? And how do I
> "eliminate" them? I suppose it would involve something like doing
> dumps with ranges 0->(empty_rev - 1), then (empty_rev + 1)
> ->(next_empty_rev - 1).
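
The range arithmetic you describe can be sketched roughly as below. This is
only an illustration: dump_ranges and dump_commands are hypothetical helper
names, the repository path is a placeholder, and the list of "empty"
revisions is assumed to have been worked out some other way. The only
Subversion pieces relied on are "svnadmin dump" with its -r LOWER:UPPER and
--incremental options. Revisions get renumbered when the pieces are loaded
into a fresh repository.

```python
def dump_ranges(head, empty_revs):
    """Return inclusive (start, end) ranges covering 0..head, skipping empty_revs."""
    ranges = []
    start = 0
    for rev in sorted(set(empty_revs)):
        if rev < 0 or rev > head:
            continue  # ignore revisions outside the repository
        if rev > start:
            ranges.append((start, rev - 1))
        start = rev + 1  # resume just past the skipped revision
    if start <= head:
        ranges.append((start, head))
    return ranges


def dump_commands(repo_path, head, empty_revs):
    """Format one 'svnadmin dump' command line per range (repo_path is a placeholder)."""
    return [
        "svnadmin dump %s -r %d:%d --incremental" % (repo_path, lo, hi)
        for lo, hi in dump_ranges(head, empty_revs)
    ]
```

Each command's output would then be fed, in order, to "svnadmin load" on the
new repository.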

*DISABLE* the old URL. This is often easily done by sending out a warning
note, sending out another warning note just before you actually do the move,
adding a new hostname for the SVN server, and making sure that any attempts
to access the old one are entirely blocked. (Easily done in pre-commit, if
you can't trivially cut off the service entirely.)

Then when people whine and complain about not having access, point them to
the mail you sent out, and send them an appropriate bit of shell or Perl
script that looks at the "svn info" of their checked-out working copies,
rewrites the URL to the new one, and does the switch.
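
Something along these lines would do as a starting point, written in Python
rather than shell or Perl. Treat it as a sketch under assumptions: the
function names and the example hostnames are made up, it expects the
English "URL: " label in the "svn info" output, and it uses
"svn switch --relocate FROM TO PATH", which only works if the repository
behind the new URL still has the same UUID. If the repository was rebuilt
with a new UUID, a fresh checkout is needed instead.

```python
import subprocess


def working_copy_url(info_text):
    """Pull the repository URL out of the text printed by 'svn info'."""
    for line in info_text.splitlines():
        if line.startswith("URL: "):
            return line[len("URL: "):].strip()
    raise ValueError("no URL line found in svn info output")


def rewrite_url(url, old_prefix, new_prefix):
    """Swap the old URL prefix for the new one, refusing anything unexpected."""
    if not url.startswith(old_prefix):
        raise ValueError("working copy does not point at %s" % old_prefix)
    return new_prefix + url[len(old_prefix):]


def relocate(path, old_prefix, new_prefix):
    """Check where a working copy points, then relocate it to the new prefix."""
    info = subprocess.run(
        ["svn", "info", path], capture_output=True, text=True, check=True
    ).stdout
    new_url = rewrite_url(working_copy_url(info), old_prefix, new_prefix)
    subprocess.run(
        ["svn", "switch", "--relocate", old_prefix, new_prefix, path], check=True
    )
    return new_url
```

For example, relocate(".", "http://old.example.com/svn",
"http://new.example.com/svn") would repoint the working copy in the current
directory; both hostnames here are placeholders.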
 

Received on Mon Aug 7 17:27:00 2006

This is an archived mail posted to the Subversion Users mailing list.