
RE: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

From: Luke Perkins <lukeperkins_at_epicdgs.us>
Date: Wed, 25 Jan 2017 08:02:08 -0800

Martin and team,

Statement: "So the only way to solve your problem is to create a tool which parses the dump files and creates a checksum in a defined way so that they are comparable."

Agreed. My thoughts exactly.

Thank you,

Luke Perkins

-----Original Message-----
From: Martin Furter [mailto:mfurter_at_bluewin.ch]
Sent: Tuesday, January 24, 2017 19:56
To: lukeperkins_at_epicdgs.us
Subject: Re: [PATCH] Issue #4668: Fixing the node key order during svnadmin dump

On 01/25/2017 03:15 AM, Luke Perkins wrote:
> Michael,
>
> I appreciate everyone's attention to this issue. I have not felt a need to be directly involved with the Subversion system, mainly because it works so well. This is the first time in 10 years I have felt the need to get directly involved with the SVN development team.
>
> Statement: "As a bug report alone, this one seems pretty easy: Closed/INVALID."
>
> I completely disagree with this statement. I have nearly 300GB of dump files used as a means of backing up my repositories. Some of these dump files are 10 years old. The incremental SVN dump file is automatically generated at each and every commit. After these incremental SVN dump files are created, they are copied and distributed to offsite locations. That way if my server farm crashes, I have a means of assured recovery.
>
> Every month I run sha512sum integrity checks on both the dump files (remotely located in 3 different locations) and the dump file produced by the Subversion server. Transferring thousands of 128-byte files is a much better option than transferring thousands of megabytes of dump files over the internet to remote locations. This method and its automated scripts have worked for 10 years. I have rebuilt my servers from the original dump files on at least 2 occasions because of computer crashes. This gives me a sanity check and validation method so that I can spot problems quickly and rebuild before things get out of hand.
>
> Asking me to redistribute 300GB of data to 3 different offsite (and remote) locations is not a good option.
>
> The SVN dump file has always been presented as the ultimate backup tool of the Subversion system. The integrity of the SVN dump file system is of paramount importance. The whole reason why SVN exists in the first place is "data integrity and traceability". The code was changed back in 2015, for better or worse, and we need present solutions to address legacy backups.

A stable order of header lines will solve your problem for now. But in the future somebody might add a new feature to Subversion and a new header field to the dump files. This will break your checksums again.

Back in the pre-1.0 days, when I was working on svndumptool, I had the same troubles with changing headers and new fields. So the only way to solve your problem is to create a tool which parses the dump files and creates a checksum in a defined way so that they are comparable.
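
For illustration, a minimal Python sketch of such a tool might look like the following. It is not svndumptool and not part of Subversion; the function names read_record and canonical_checksum are made up for this example. It assumes every record is a block of "Key: value" header lines ended by a blank line, and that a Content-length header is present whenever a body follows. Headers are sorted before hashing, so two dumps that differ only in header order give the same digest.

```python
import hashlib


def read_record(stream):
    """Read one dump record: a header block plus an optional body.

    Returns (headers, body), or None at end of input.
    """
    line = stream.readline()
    while line == b"\n":              # skip blank lines between records
        line = stream.readline()
    if not line:
        return None
    headers = {}
    while line not in (b"", b"\n"):   # header block ends at a blank line
        key, _, value = line.rstrip(b"\n").partition(b": ")
        headers[key] = value
        line = stream.readline()
    # Assumption: Content-length is present whenever a body follows.
    length = int(headers.get(b"Content-length", b"0"))
    body = stream.read(length)
    return headers, body


def canonical_checksum(stream):
    """SHA-512 over all records, with headers hashed in sorted order."""
    digest = hashlib.sha512()
    while (record := read_record(stream)) is not None:
        headers, body = record
        for key in sorted(headers):
            digest.update(key + b": " + headers[key] + b"\n")
        digest.update(b"\n")
        digest.update(body)
    return digest.hexdigest()
```

The dump file has to be opened in binary mode, e.g. open(path, "rb"), where path is whatever dump file is being checked. A new header field would still change the digest; a real tool would also need a defined list of headers to include or ignore, and the property blocks inside each body would need the same treatment if their order can vary.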

- Martin
Received on 2017-01-25 17:02:22 CET

This is an archived mail posted to the Subversion Dev mailing list.