RE: How Big A Dump File Can Be Handled?

From: Geoff Field <Geoff_Field_at_aapl.com.au>
Date: Wed, 21 Aug 2013 16:29:17 +1000

> From: Ben Reser
> Sent: Wednesday, 21 August 2013 12:12 PM
> On Tue Aug 20 16:44:08 2013, Geoff Field wrote:
> > I've seen some quite large dump files already - one got up
> > to about 28GB. The svnadmin 1.2.3 tool managed to cope with
> > that quite successfully. Right now, our largest repository
> > (some 19,000 revisions with many files, including
> > installation packages) is dumping. In the 5300 range of
> > revisions, the dump file has just passed 9GB.

Overall, it got to about 29GB. Dump and load worked fine, although they got a bit slow towards the end. (In fact, I delayed sending this until it had actually finished.)

> Shouldn't be a problem within the limits of the OS and filesystem.

I've just realised that my concern came from power-of-2 limits: a file size or offset held in a signed 32-bit integer rolls over at the 2GB mark, and an unsigned one at 4GB. It's possible the Windows Server 2003 file system might have started to complain when it ran out of block indices/counters or some such, but if a file of 4.1GB or more works, there's no reason a 32GB+ file won't.
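
To make the arithmetic behind those two marks concrete, here is a small sketch. It assumes "GB" is meant in the binary sense (2^30 bytes); the helper name fits_in_32_bits is made up for illustration and has nothing to do with svnadmin's internals.

```python
# Largest byte counts a 32-bit integer can hold.
SIGNED_32_MAX = 2**31 - 1    # 2,147,483,647: just under 2 GiB
UNSIGNED_32_MAX = 2**32 - 1  # 4,294,967,295: just under 4 GiB
GIB = 2**30                  # assumption: "GB" above means 2**30 bytes


def fits_in_32_bits(size_bytes, signed=True):
    """Return True if a size in bytes can be stored in a 32-bit integer."""
    limit = SIGNED_32_MAX if signed else UNSIGNED_32_MAX
    return 0 <= size_bytes <= limit


# The finished dump was roughly 29GB, well past both limits.
dump_size = 29 * GIB
print(fits_in_32_bits(dump_size, signed=True))
print(fits_in_32_bits(dump_size, signed=False))
```

A tool that handles a dump of a little over 4GB must already be using offsets wider than 32 bits, which is why the size of the later dumps stops being a concern at that point.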

> However, I'd say why are you bothering to produce dump files?
> Why not simply pipe the output of your dump command to a
> load command, e.g.
>
> svnadmin create newrepo
> svnadmin dump --incremental oldrepo | svnadmin load newrepo

I've been working in Windoze too long - I failed to think of that option. I'll use that for the rest of the repositories (about 19 remain to be done). Thank you for that application of the clue-by-four. You've made the rest of my task a lot easier.

I really should have done it all using a scripting language of some sort, too. I've told myself it's really too close to the end of the process to think of *that* change now, except I've just managed to quickly throw together a batch file to do the job. I could probably have done it in python or some other scripting language, but batch files are quick and easy. Again, thanks Ben for the prompt to use my head a bit better (even though you didn't explicitly suggest this aspect).

CopyBDBToFSFS.bat:

  rem Usage: CopyBDBToFSFS.bat repository-path
  rem Create a new repository - using the OLD format just in case we need to switch back to the old server
  "C:\Program Files\Subversion\bin\svnadmin.exe" create "%~1_FSFS"
  if errorlevel 1 exit /b 1
  rem Copy the data from the old repository to the new one
  "C:\Program Files\Subversion\bin\svnadmin.exe" dump --incremental "%~1" | "C:\Program Files\Subversion\bin\svnadmin.exe" load "%~1_FSFS"
  rem Stop before renaming anything if the load reported a failure
  if errorlevel 1 exit /b 1
  rem Change the names to make the new repository accessible using the existing authentication and URLs and the old one accessible for emergency use.
  rem (%~1 strips any quotes from the argument; ren needs a bare name, not a path, as its second argument, hence %~nx1)
  ren "%~1" "%~nx1_BDB"
  ren "%~1_FSFS" "%~nx1"
  rem Check the new repository with the current tools to confirm it's OK.
  svnadmin verify "%~1"

Note that we have the old version 1.2.3 server software installed at the C:\Program Files\Subversion location, and later versions are stored under other locations, with the path set to point to the new version. I'm creating the new repositories with the old version for those (hopefully rare) occasions when we need to switch back to the old server version.
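
For comparison, the same steps could be sketched in Python roughly as follows. This is an untested outline, not what was actually run: the function names (new_names, build_commands, convert) are invented for illustration, and it assumes the old 1.2.3 svnadmin.exe is at the location given above and that a newer svnadmin is first on the PATH.

```python
import os
import subprocess
import sys

# Assumption: the old 1.2.3 binary lives here, as described above.
OLD_SVNADMIN = r"C:\Program Files\Subversion\bin\svnadmin.exe"


def new_names(repo):
    """Return (temporary FSFS path, BDB backup path) for a repository path."""
    repo = repo.rstrip("\\/")
    return repo + "_FSFS", repo + "_BDB"


def build_commands(repo, svnadmin=OLD_SVNADMIN):
    """Build the create/dump/load command lines without running them."""
    fsfs, _ = new_names(repo)
    return {
        "create": [svnadmin, "create", fsfs],
        "dump": [svnadmin, "dump", "--incremental", repo],
        "load": [svnadmin, "load", fsfs],
    }


def convert(repo, svnadmin=OLD_SVNADMIN):
    """Pipe dump into load, then swap the directory names and verify."""
    repo = repo.rstrip("\\/")
    fsfs, bdb = new_names(repo)
    cmds = build_commands(repo, svnadmin)
    subprocess.run(cmds["create"], check=True)
    dump = subprocess.Popen(cmds["dump"], stdout=subprocess.PIPE)
    load = subprocess.Popen(cmds["load"], stdin=dump.stdout)
    dump.stdout.close()  # so dump sees a broken pipe if load exits early
    load_rc = load.wait()
    dump_rc = dump.wait()
    if dump_rc != 0 or load_rc != 0:
        raise RuntimeError("dump/load failed for %s; nothing renamed" % repo)
    os.rename(repo, bdb)   # keep the old repository for emergency use
    os.rename(fsfs, repo)  # new repository takes over the existing name
    # Verify with whichever (newer) svnadmin is first on the PATH.
    subprocess.run(["svnadmin", "verify", repo], check=True)


if __name__ == "__main__":
    for path in sys.argv[1:]:
        convert(path)
```

The one thing this adds over a plain pipeline is that it checks the exit codes of both dump and load before touching the directory names, so a failed copy leaves the original repository where it was.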

> You'll need space for two repos but that should be less than
> the space the dump file will take.

We're keeping the old repos anyway, just in case. We're an automotive parts company with support requirements for some quite old versions, so we can't afford to throw away too much history. Even though it's a RAID system (using Very Expensive disk drives, so it's actually a RAVED system), there's lots of space available on the drive where the repositories live.

> I included the
> --incremental option above because there's no reason to
> describe the full tree for every revision when you're doing a
> dump/load cycle.

That makes sense.

> You can save space with --deltas if you
> really want the dump files, but at the cost of extra CPU time.
> If you're just piping to load the CPU to calculate the delta
> isn't worth it since you're not saving the dump file.

I agree. The server's not particularly new, so if I can save on processor time that's a good thing. I'm discarding/reusing the dump files anyway, since we're keeping the original repositories (and we have a separate backup system for the servers - I know it works too, because I've had to restore some of the BDB repositories from it).

Regards,

Geoff


Received on 2013-08-21 08:30:12 CEST
