
RE: Incomplete SVN dump files

From: Bert Huijben <bert_at_qqmail.nl>
Date: Wed, 16 Sep 2015 11:33:07 +0200

> -----Original Message-----
> From: Andreas Mohr [mailto:andi_at_lisas.de]
> Sent: Wednesday, 16 September 2015 07:48
> To: Eric Johnson <eric_at_tibco.com>
> Cc: bert_at_qqmail.nl; users_at_subversion.apache.org
> Subject: Re: Incomplete SVN dump files
>
> Hi,
>
> On Tue, Sep 15, 2015 at 05:26:38PM -0700, Eric Johnson wrote:
> > I just checked, and there aren't any open bugs about this.
> > Interrupting svnrdump can result in a dump file with not all the files of
> > the last commit in the dump record. Accidentally use that dump file to
> > load into a new repository, and the resulting repository will not be a
> > copy of the original.
> > My particular use case, I was trying to suck down a large repository.
> > Connection interrupted part way through. I resumed from part way through
> > (using the --incremental option) into an additional dump file. Then did a
> > load of those two dump files. Did not yield a copy of the original
> > repository, though.
> > This seems like a critical issue for possible data loss when copying
> > repositories from machine to machine using svnrdump.
>
> AFAICS (not an svnrdump expert here) very well described and to the point.
> You just managed to pinpoint a rather important serialization format
> that seemingly isn't fully properly atomically transaction-safe...
> (good catch!)

In some ways a dumpfile is a stream and not a file... and when you use the
command-line tools you always obtain it from stdout.

I could argue that in that case you should check whether the operation
exited successfully or with an error.

After an error you can't trust that the final portion is ok.
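
As an illustration of that check, here is a minimal Python sketch (not part of Subversion; the helper name dump_to_file is made up for this example). It captures a command's stdout in a temporary file and only moves it to its final name when the command exits with status 0, so a truncated dump is never left lying around under the real name.

```python
import os
import subprocess
import sys
import tempfile


def dump_to_file(cmd, dest):
    """Run cmd and keep its stdout in dest only if it exits with status 0.

    Returns True on success. On failure the partial output is deleted,
    so a truncated dump cannot be mistaken for a complete one.
    (Hypothetical helper for illustration only.)
    """
    dest_dir = os.path.dirname(os.path.abspath(dest))
    # Write to a temporary file in the same directory, so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".partial")
    try:
        with os.fdopen(fd, "wb") as out:
            result = subprocess.run(cmd, stdout=out)
        if result.returncode != 0:
            print(f"{cmd[0]} exited with status {result.returncode}; "
                  "discarding partial output", file=sys.stderr)
            return False
        os.replace(tmp_path, dest)
        return True
    finally:
        # Clean up the temporary file if it was not renamed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

With a placeholder repository URL in the variable url, a call would look like dump_to_file(["svnrdump", "dump", url], "repo.dump"); a False result means the output was thrown away and the dump has to be redone.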

The stream was also deliberately designed so that you can generate it
incrementally, e.g. after each new revision or as a daily backup
operation.

Adding some 'this is the end' marker would break those use cases, which we
have relied on since the day Subversion became self-hosted (long before 1.0).

And when loading from a stream we can't read ahead to the end to check for
a final marker, because at that point we can't go back to the start and
run the whole load again.
(I've used '$ svnadmin dump .... | ssh .... svnadmin load ...' more than a
few times for repository migrations.)
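
For a piped migration like that, the same exit-status check applies to both ends of the pipe. A rough Python sketch, assuming nothing beyond the standard subprocess module (the function name pipe_dump_to_load is invented for this example):

```python
import subprocess


def pipe_dump_to_load(dump_cmd, load_cmd):
    """Stream dump_cmd's stdout into load_cmd's stdin.

    Returns (dump_status, load_status). The copy should only be
    trusted when both are 0. (Hypothetical helper for illustration.)
    """
    dumper = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    loader = subprocess.Popen(load_cmd, stdin=dumper.stdout)
    # Close our copy of the pipe so the dumper is notified if the
    # loader exits early.
    dumper.stdout.close()
    load_status = loader.wait()
    dump_status = dumper.wait()
    return dump_status, load_status
```

The two commands would be passed as argument lists, for example ["svnadmin", "dump", src] and ["ssh", host, "svnadmin", "load", target], where src, host and target are placeholders. Because the load commits revisions as they arrive, a non-zero status on either side means the target repository already holds whatever was loaded before the failure and should be checked or recreated.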

        Bert
Received on 2015-09-16 11:33:28 CEST

This is an archived mail posted to the Subversion Users mailing list.