Other points:
* The terms used are unclear.
(Example: 'incremental backup' could mean 'zfs/lvm snapshot', could
mean "find -mtime | xargs tar", and could mean 'incremental dump'.)
I'd rather get a concrete set of facts (read: one that doesn't require
me to guess what the facts are) before suggesting a recovery procedure.
* There are some more subtleties here.
One example: under certain conditions (which depend on your
configuration and on your data), if you don't enable the rep-cache
while re-creating revisions, the pieced-together repository may be
corrupted.
Stefan Sperling wrote on Thu, Mar 24, 2011 at 12:17:19 +0100:
> On Tue, Mar 22, 2011 at 08:27:39PM -0700, bdu12 wrote:
> > Hello,
> >
> > I used to have two SVN repositories and a single trac DB setup running
> > in Ubuntu on vmware. The server had a daily cron job that ran each
> > night doing incremental backups onto an email server (the incremental
> > backups back up files that were changed during the day).
> > Occasionally I also did a snapshot of the system for a complete
> > backup.
>
> I guess this means that you backed up the repository files as
> they appeared on the filesystem (i.e. repos/db folders etc.).
>
> Which repository backend were you using? BDB or FSFS?
> See the file called repos/db/fs-type.
> FSFS has been the default for quite some time so it's likely that you
> have FSFS repositories.
>
> Do you have any dump files of repository data (obtainable via svnadmin dump)?
>
> > My server was lost in the Christchurch earthquake in NZ on
> > 22/02/2011 and I am trying to rebuild everything from these backups.
> >
> > I have managed to recover an old snapshot from:
> > - 24/01/2011
> >
> > I have also managed to recover all incremental backups except for two:
> > - 28/01/2011
> > - 29/01/2011
> >
> > When the quake hit I managed to take the client laptops out of the
> > office. They hold checkouts of the different projects, made at
> > different revisions of the DBs.
>
> "DB" as in "Subversion repository"?
>
> Note that a checkout may not mirror any given revision in the
> repository. A working copy can contain mixed revisions.
> See http://svnbook.red-bean.com/nightly/en/svn.basic.in-action.html#svn.basic.in-action.mixedrevs
>
> > Is there a way of rebuilding my server with all of this data? I am
> > not sure what I was working on on the days that I lost the incremental
> > backups.
>
> Once a revision file has been created, it is never changed.
> Newer revisions often refer to data from older revisions.
>
> In FSFS repository filesystems, Subversion stores revision data as
> separate files in repos/db/revs and repos/db/revprops.
> So incremental backups of FSFS repositories should only be adding new files
> to the repository, with one exception. The file repos/db/current contains
> the number of the HEAD revision, and should change in every incremental backup.
>
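To make the layout described above concrete, here is a rough sketch that
reads db/current and lists which revision numbers have no file under
db/revs or db/revprops. The function names are mine, not part of
Subversion. It assumes an unpacked FSFS repository (flat or sharded
layout) and takes the first field of 'current' as the HEAD revision.
Packed shards, if any, are not handled.

```python
import os


def present_revisions(subdir):
    """Return the set of revision numbers that have a file under `subdir`.

    Walks the tree so that both the flat layout (revs/123) and the
    sharded layout (revs/0/123) are covered. Only all-digit file names
    are counted.
    """
    found = set()
    for _dirpath, _dirnames, filenames in os.walk(subdir):
        for name in filenames:
            if name.isascii() and name.isdigit():
                found.add(int(name))
    return found


def head_revision(db_dir):
    """Read the HEAD revision number from db/current (first field)."""
    with open(os.path.join(db_dir, "current")) as f:
        return int(f.read().split()[0])


def missing_revisions(db_dir):
    """Return (head, missing revs, missing revprops) for a db directory."""
    head = head_revision(db_dir)
    expected = set(range(head + 1))
    revs = present_revisions(os.path.join(db_dir, "revs"))
    revprops = present_revisions(os.path.join(db_dir, "revprops"))
    return head, sorted(expected - revs), sorted(expected - revprops)
```

Calling missing_revisions("repos/db") on the pieced-together repository
should show whether the two lost days actually contained any commits.
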
> If the data you already have tells you enough to figure out what the
> missing changes were, you can try restoring revisions in order.
> Use the full backup you have as a starting point.
> Then copy in new revision files from incremental backups, and also copy
> or adjust the 'current' file.
> For the missing revisions, you will need to manually replay the *exact*
> changes made in them (using checkout and commit from a working copy),
> so that future revisions fit on top.
>
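The copying step above could be sketched roughly as follows. This is an
illustration only, and overlay_backup is a name I made up. It assumes
each incremental backup contains a copy of the repository's db directory
with the same relative layout, and that you apply the backups in
chronological order. Files that already exist with different content are
reported rather than overwritten, so you can look at them by hand.
Deciding where to stop, and what 'current' should finally say, is still
up to you.

```python
import filecmp
import os
import shutil


def overlay_backup(backup_db, restore_db):
    """Copy files from one incremental backup into the restored db dir.

    New files are copied. The top-level 'current' file is always
    overwritten. Any other file that already exists with different
    content is reported as a conflict and left untouched.
    """
    copied, conflicts = [], []
    for dirpath, _dirnames, filenames in os.walk(backup_db):
        rel_dir = os.path.relpath(dirpath, backup_db)
        target_dir = os.path.normpath(os.path.join(restore_db, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            rel = os.path.normpath(os.path.join(rel_dir, name))
            if rel == "current" or not os.path.exists(dst):
                shutil.copy2(src, dst)
                copied.append(rel)
            elif not filecmp.cmp(src, dst, shallow=False):
                conflicts.append(rel)
    return sorted(copied), sorted(conflicts)
```

Run it on a scratch copy of the full backup, never on the only copy you
have.
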
> If that doesn't get you anywhere, a more low-level approach might work:
>
> Try to figure out which files and directories were changed in the missing
> revisions. E.g. try to open every file in every revision (from HEAD downwards)
> and note the path/revision pairs for which svn errors out.
> Now locate revisions that changed the same paths.
> Decode the corresponding revision files from before and after the missing
> revisions and try to interpolate the changes that the missing revisions
> were describing. A tool that can decode FSFS revision files is here:
> https://svn.apache.org/repos/asf/subversion/trunk/contrib/server-side/fsfsverify.py
> This tool parses FSFS revision files and prints their content in
> somewhat human-readable form. To understand the decoded data, refer to
> these documents:
> http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_base/notes/fs-history
> http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_base/notes/structure
> http://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure
> http://svn.apache.org/repos/asf/subversion/trunk/notes/svndiff
>
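The first step of the low-level approach (opening every file in every
revision, from HEAD downwards, and noting what fails) might be scripted
along these lines. It is a sketch under assumptions: it shells out to
svnlook (youngest, tree --full-paths, cat), treats a non-zero exit code
as "svn errors out", and the helper names are hypothetical. It is slow on
anything but a small repository, since it reads every file at every
revision.

```python
import subprocess


def run_svnlook(args):
    """Run `svnlook` with the given arguments; return (exit code, stdout)."""
    proc = subprocess.run(
        ["svnlook"] + args,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode, proc.stdout.decode("utf-8", "replace")


def find_unreadable(repos, run=run_svnlook):
    """Return (revision, path) pairs that svnlook cannot read.

    A pair of (revision, None) means the tree of that revision could
    not be listed at all.
    """
    code, out = run(["youngest", repos])
    if code != 0:
        raise RuntimeError("cannot determine the youngest revision")
    head = int(out.strip())

    broken = []
    for rev in range(head, 0, -1):
        code, out = run(["tree", "--full-paths", "-r", str(rev), repos])
        if code != 0:
            broken.append((rev, None))
            continue
        for path in out.splitlines():
            # Directories are listed with a trailing slash; skip them.
            if not path or path.endswith("/"):
                continue
            code, _ = run(["cat", "-r", str(rev), repos, path])
            if code != 0:
                broken.append((rev, path))
    return broken
```

The `run` parameter is only there so the walk can be tried against a
stand-in before pointing it at the real repository.
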
> A similar approach might work for BDB backends, but off-hand I don't
> know how it would be done. Some of the above links talk about BDB,
> and information within them is also important for understanding FSFS.
>
> > I think in theory I should be able to have the current working copy
> > per project and then perform a reverse diff of the revisions from
> > 30/01/2011 to the date just before the checkout of the project in
> > question. This should then give the version at 30/01/2011. Then,
> > using this version together with the version of 27/01/2011, I should
> > be able to recalculate per project what occurred on the lost dates of
> > 28/01/2011 and 29/01/2011. These results could then be used to rebuild
> > all of the repositories as they were before the earthquake.
>
> The problem with this approach is that checkouts (and maybe diffs)
> referring to missing revision data simply won't work.
>
> Good luck!
Received on 2011-03-25 18:27:59 CET