
Re: [RFC] Altering copyfrom information in repository

From: Talden <talden_at_gmail.com>
Date: Thu, 24 Nov 2011 00:25:54 +1300

On Wed, Nov 23, 2011 at 3:04 AM, Stefan Sperling <stsp_at_elego.de> wrote:
> On Tue, Nov 22, 2011 at 01:32:02PM +0100, Johan Corveleyn wrote:
>>
>> Having a way to do this with svnsync and svndumptool would already be
>> very useful. It would at least give some assurance to svn admins that
>> these things are 'repairable'. Being able to fix a live repository
>> would of course be even better :-).
>
> Editing dump files has always been the approach to fixing up mistakes in
> history. So I don't think we need a change in the FS backends. Instead
> we need better dump file editing capabilities. svndumptool might go
> part of the way. But ideally this would be built into svndumpfilter or
> a new official tool that edits dump files (rather than just filtering
> nodes from a dumpfile).

The approach I've taken a few times is to clone the repo up to, but
excluding, the faulty revision, then do an incremental dump (full-text,
not deltas) of the revisions following the faulty revision.

Apply the patch of the faulty revision to a WC at the clone's tip (but
with corrected gestures), commit, then load the remaining revisions into
the clone. Swap the master with the clone at an opportune time, when the
clone can get caught up and we know that no users are still using WCs
at the problem revision.

This seems a safe path. Step 1, making the clone of the bulk of
history, is pretty slow the official way though (via dump/load).
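
The clone-and-replay procedure above can be sketched as a small Python
helper that only builds the command plan and runs nothing. This is a
rough sketch, not a tested recipe: the function name plan_repair, the
file name later.dump and the example paths are all made up for
illustration, and it assumes at least one revision exists after the
faulty one. Because the corrected commit lands as the same revision
number as the faulty one, the later revisions keep their numbers when
loaded.

```python
import shlex
from typing import List


def plan_repair(repo: str, clone: str, bad_rev: int) -> List[str]:
    """Return shell commands for the clone-and-replay repair, in order.

    Nothing is executed here; the plan is meant to be reviewed and run
    by hand. 'later.dump' is a hypothetical scratch file name.
    """
    if bad_rev < 1:
        raise ValueError("bad_rev must be >= 1 (r0 is the empty initial revision)")
    src, dst = shlex.quote(repo), shlex.quote(clone)
    return [
        # 1. Clone history up to, but excluding, the faulty revision.
        f"svnadmin create {dst}",
        f"svnadmin dump {src} -r 0:{bad_rev - 1} | svnadmin load {dst}",
        # 2. Incremental dump of everything after the faulty revision.
        #    svnadmin dump writes full texts unless --deltas is passed.
        f"svnadmin dump {src} -r {bad_rev + 1}:HEAD --incremental > later.dump",
        # 3. Manual step: redo the faulty change in a WC of the clone.
        f"# manual: apply r{bad_rev} to a WC at the clone's tip, corrected, and commit",
        # 4. Replay the remaining revisions on top of the corrected commit.
        f"svnadmin load {dst} < later.dump",
    ]


if __name__ == "__main__":
    for line in plan_repair("/srv/svn/repo", "/srv/svn/clone", 1234):
        print(line)
```

Printing the plan first keeps the slow dump/load steps and the manual
commit under the admin's control; swapping the clone in for the master
is deliberately left out, since that timing depends on the users.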

I'd like to know if there's a reliable way to hot-copy a repo and then
roll it back to a specific revision (trashing the newer revisions). I
haven't looked into how safe that is since repcache and sharding came
into being, but it's the only way this approach could be used on our
repository: making a clone to a specific revision using dump/load
would take a day or so to get anywhere close to our repo tip.

Things I've had to fix:
- Lost history (the example of this thread).
- Squishing revisions (and padding with empties). People forgetting
to commit a whole tree in merges seem to be the main culprit here, or
people using silly tools that commit every save/rename/add separately.
- Stripping content: big binaries that are removed from HEAD in short
order, leaving 0-value history. Really a space-is-precious special case
of the previous motivation.

In rare cases I've done some dump-file editing to reshape the tree
(truly rewriting history). I'm always wary of that - you have to be
very sure that the loss of real history is worth the cleanup.

I'd like to do this a lot more often than I do, but the tools for it
are poor. That's sad when you consider that this is one of the
strengths of the centralised model: you can fix history. In a DVCS,
once it's out the gate you're pretty much done for unless you force
everyone else to reclone from your rebase-point and forget any history
they had intermingled in the abandoned timeline. Shining here would
make Subversion even more attractive to the corporate space.

--
Talden
Received on 2011-11-23 12:26:27 CET

This is an archived mail posted to the Subversion Dev mailing list.
