
Re: Large repository with a long checkout problem

From: Troy Curtis Jr <troycurtisjr_at_gmail.com>
Date: 2006-08-06 23:21:12 CEST

On 8/5/06, Nico Kadel-Garcia <nkadel@comcast.net> wrote:
> Troy Curtis Jr wrote:
> > Hello all,
> >
> > I am attempting to convert the RCS repository that we use at work
> > over to Subversion. I was able to use cvs2svn successfully (with a
> > couple of tweaks) and got all of our changes into the shiny new
> > Subversion repo. The first issue is that it ballooned a ~1GB RCS
> > repository to ~2GB, but we can deal with that, especially since it
> > took us 12 years or so to get the first 1GB. Also, the HEAD revision
> > is 60202, so there are a lot of commits.
> >
> > The main problem is checkout time, which is sitting at 5.5 minutes
> > over the LAN. I can get the same code down to 1.5 minutes (which
> > matches our current method) if I export the HEAD revision and import
> > it into a brand-new repo. Now, I am using FSFS, which I know has
> > longer checkouts because it must walk back through the diffs toward
> > the original version of each file. I have looked at Berkeley DB, but
> > I just know that some of the more "seasoned" users (read: resistant
> > to change) will get very upset the first time they have to run
> > "svnadmin recover", or worse yet, wait for me to do one.
> >
> > All that background to ask this question: is there a way to tell
> > Subversion to build a complete revision at some specific point and
> > then diff from there on? (Note that I only care about doing this for
> > the trunk; the branches can stay where they are.) That way the svn
> > server would only have to go back a relatively few revisions. But we
> > have to keep the past revisions, so I cannot just start a new
> > repository at some arbitrary point. Specifically, we need the log
> > messages of all those revisions. Exporting the HEAD, deleting all
> > the files, and then importing the exported files would get a fresh
> > copy of the code set into a specific revision, but then I would not
> > have the logs and previous diffs for all the prior revisions.
>
> I think you can do an "export", import that to a new repository, then get a
> differential dump of the changes to apply on top.
>

I am not exactly sure what you mean in this paragraph, but it sounds
kinda promising. Could I ask you to elaborate a little?
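If I am following you, the steps would be roughly the following. This
is only my guess at the workflow, written as a dry run; all the paths
and the base revision number are made up for illustration, and setting
RUN to empty would actually execute it:

```shell
#!/bin/sh
# Dry-run sketch of the export/import/incremental-dump workflow.
# OLD, NEW, and BASE are placeholders, not real paths or revisions.
RUN=echo                        # set RUN= (empty) to really run it
OLD=/var/svn/old-repo           # existing converted repository
NEW=/var/svn/new-repo           # fresh repository to be created
BASE=60202                      # revision of the snapshot

# 1. Snapshot trunk at BASE and import it as the new repo's r1.
$RUN svn export "file://$OLD/trunk@$BASE" /tmp/trunk-snapshot
$RUN svnadmin create "$NEW"
$RUN svn import /tmp/trunk-snapshot "file://$NEW/trunk" \
    -m "trunk snapshot at r$BASE"

# 2. Dump only the revisions after the snapshot, to replay on top
#    (when run for real: dump to a file, then `svnadmin load` it).
$RUN svnadmin dump "$OLD" -r "$((BASE + 1)):HEAD" --incremental
```

Is that about right, or did you have something else in mind for getting
the differential part applied on top?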

> Also, eliminating "empty" changes may help significantly reduce the number
> of changes, but do *not* try that stunt without starting a new URL to the
> repository to force a successful "svn switch".
>

Is there an easy (read: scriptable) way to do this? And how do I
"eliminate" them? I suppose it would involve something like dumping
with ranges 0 -> (empty_rev - 1), then (empty_rev + 1) ->
(next_empty_rev - 1), and so on.
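In other words, something like the sketch below? It assumes the empty
revision numbers have already been collected somehow (maybe by running
`svnlook changed` per revision and noting the ones with no output); the
repository path, HEAD, and the empty revision numbers are all made up,
and RUN=echo keeps it a dry run:

```shell
#!/bin/sh
# Emit ranged incremental dumps that skip known "empty" revisions.
# REPO, HEAD_REV, and EMPTY_REVS are hypothetical values.
RUN=echo
REPO=/var/svn/old-repo
HEAD_REV=60202
EMPTY_REVS="10531 20788"        # ascending, found beforehand

start=0
for e in $EMPTY_REVS; do
    # Dump up to just before the empty revision...
    $RUN svnadmin dump "$REPO" -r "$start:$((e - 1))" --incremental
    # ...then resume just after it.
    start=$((e + 1))
done
$RUN svnadmin dump "$REPO" -r "$start:$HEAD_REV" --incremental
```

And since loading those dumps would renumber every revision after the
first gap, I can see why you say a new URL and an "svn switch" would be
mandatory.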

> > Do you guys have any suggestions on how to solve this? If I cannot
> > get the FSFS checkout times down, I will have to go to BDB, but I
> > just know that some of those guys are really going to give me flak
> > when the repo gets wedged.
>
> Look for big files that might require lengthy checksums, and directories
> that have more than a few thousand files in them (if you are using ext2 or
> ext3 from before kernel 2.6 on Linux).
>
>
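For reference, a quick scan for those hot spots could go something like
this. It is just a sketch over an exported tree; the 2000-entry and
10MB thresholds are arbitrary, and TREE defaults to the current
directory since the real path would be site-specific:

```shell
#!/bin/sh
# Flag directories with thousands of entries and list very large files.
# TREE is a placeholder; point it at a repository export to scan it.
TREE="${TREE:-.}"

# Directories with more than 2000 entries (slow on pre-2.6 ext2/ext3).
find "$TREE" -type d | while IFS= read -r d; do
    n=$(ls -1 "$d" | wc -l)
    if [ "$n" -gt 2000 ]; then
        echo "$n entries: $d"
    fi
done

# Files over 10MB, whose checksums could make checkouts crawl.
find "$TREE" -type f -size +10M -exec ls -lh {} \;
```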

-- 
"Beware of spyware. If you can, use the Firefox browser." - USA Today
Download now at http://getfirefox.com
Registered Linux User #354814 ( http://counter.li.org/)
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org
Received on Sun Aug 6 23:22:25 2006

This is an archived mail posted to the Subversion Users mailing list.
