
Re: Incomplete SVN dump files

From: Eric Johnson <eric_at_tibco.com>
Date: Tue, 15 Sep 2015 17:26:38 -0700

I just checked, and there aren't any open bugs about this.

Interrupting svnrdump can leave a dump file in which the last commit is
missing some of its files. If you then accidentally load that dump file
into a new repository, the resulting repository will not be a copy of
the original.

In my particular use case, I was trying to pull down a large repository.
The connection was interrupted partway through. I resumed from that point
(using the --incremental option) into an additional dump file, then loaded
the two dump files. The result was not a copy of the original repository,
though.
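
One way to guard against this when resuming is to throw away the last
revision of the interrupted dump file before loading it, and to restart the
second dump from that revision. The following is a minimal Python sketch, not
part of Subversion. It walks the dump file record by record, uses each
record's Content-length header to skip over bodies, and copies everything
before the final Revision-number record into a new file. The function names
revision_offsets and drop_last_revision are hypothetical, and the sketch
assumes the dump file follows the usual layout of header blocks separated by
blank lines.

```python
import os


def revision_offsets(path):
    """Return [(revision_number, byte_offset), ...] for each revision record."""
    revisions = []
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        while True:
            start = f.tell()
            line = f.readline()
            if not line:
                break  # end of file
            if line == b"\n":
                continue  # blank separator between records
            # Read one header block: "Key: value" lines up to a blank line.
            headers = {}
            while line and line != b"\n":
                key, _, value = line.rstrip(b"\n").partition(b": ")
                headers[key] = value
                line = f.readline()
            number = headers.get(b"Revision-number", b"")
            if number.isdigit():
                revisions.append((int(number), start))
            # Skip the record body so file contents are never read as headers.
            length = headers.get(b"Content-length", b"0")
            f.seek(min(f.tell() + (int(length) if length.isdigit() else 0), size))
    return revisions


def drop_last_revision(src, dst):
    """Copy src to dst without its final revision record.

    Returns the number of the dropped revision, i.e. the revision from
    which the next dump should resume.
    """
    revisions = revision_offsets(src)
    if not revisions:
        raise ValueError("no revision records found in " + src)
    last_rev, offset = revisions[-1]
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        remaining = offset
        while remaining > 0:
            chunk = fin.read(min(remaining, 1 << 20))
            if not chunk:
                break
            fout.write(chunk)
            remaining -= len(chunk)
    return last_rev
```

If drop_last_revision returns N, the second dump would then be started at
revision N, for example with svnrdump dump URL -r N:HEAD --incremental, so
that the possibly incomplete revision is fetched again in full.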

This seems like a critical issue, with the potential for data loss, when
copying repositories from machine to machine using svnrdump.

I suspect the right solution to this is to put an "end of file" marker at
the end of a dump stream. If the marker is missing, svnadmin load would
notice its absence and should discard the last commit.
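
Until the dump stream carries such a marker, a similar safeguard can be
approximated outside the file. The sketch below (Python, with the
hypothetical helper name dump_to_file) writes the dump under a temporary
.partial name and renames it to its final name only when the dumping command
exits with status 0. It assumes that svnrdump reports an interrupted
connection through a non-zero exit status, which should be verified before
relying on it.

```python
import os
import subprocess
import sys


def dump_to_file(cmd, dest):
    """Run cmd, keeping its stdout as dest only if it exits with status 0."""
    partial = dest + ".partial"
    with open(partial, "wb") as out:
        result = subprocess.run(cmd, stdout=out)
    if result.returncode != 0:
        # Leave the partial file for inspection, but never under the final name.
        raise RuntimeError(
            "dump failed with exit status %d; see %s" % (result.returncode, partial)
        )
    os.replace(partial, dest)
    return dest


# Hypothetical usage (URL is a placeholder for the remote repository):
#   dump_to_file(["svnrdump", "dump", URL], "repo.dump")
```

A file that exists under its final name then plays the role of the marker:
anything still named .partial came from an interrupted run and should not be
loaded as is.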

Eric.

On Tue, Sep 15, 2015 at 6:09 AM, Eric Johnson <eric_at_tibco.com> wrote:

> Hi Bert,
>
> The files that made it into the dump file were complete. It is just that
> the last commit in the dump file didn't have all of the files it was
> supposed to have.
>
> This may be a deliberate design of the dump file format, but it does mean
> that svnrdump is badly broken. Svnrdump should not be dumping a partial
> commit upon network failure!
>
> Eric
>
> On Sep 15, 2015, at 1:52 AM, "bert_at_qqmail.nl" <bert_at_qqmail.nl> wrote:
>
> In what way was the dump file incomplete?
>
> Was it broken halfway through a file? (That should have been caught via
> the checksums in the file.) If a whole node edit is missing, it is still a
> valid dump file, and the current dump format has no way of knowing when a
> revision is done. (This allows the revisions to be edited in this format,
> as is sometimes done during migrations.)
>
> Bert
>
> *From: *Eric Johnson
> *Sent: *Tuesday, 15 September 2015 07:16
> *To: *users_at_subversion.apache.org
> *Subject: *Incomplete SVN dump files
>
> I'm in a situation where I'm dumping Subversion repositories from remote
> locations (using svnrdump).
>
> The repositories are big enough, and the network connections between
> sites just unstable enough, that the repositories aren't making it across
> in a single dump call. I've noticed, for one repository in particular, that
> I actually got a dump file that contained only part of the last commit
> before the connection broke.
>
> When I loaded up the repository, Subversion reported no problems on the
> svnadmin load, but it seems to me it should have noticed that the final
> commit recorded in the dump file was incomplete, and discarded it. Instead,
> it happily loaded it, and reported no problems.
>
> At least I was lucky enough to check whether it was complete, and I used a
> technique <http://superuser.com/a/315138> to drop just the last
> revision. Now I can load a new dump file starting from the commit that was
> incomplete.
>
> This brings me back to my question - shouldn't the load process ignore the
> last commit if it is incomplete in the dump file? That way I know I have an
> error to address!
>
> Eric.
>
Received on 2015-09-16 03:27:14 CEST

This is an archived mail posted to the Subversion Users mailing list.