Re: [RFC] Altering copyfrom information in repository

From: Johan Corveleyn <jcorvel_at_gmail.com>
Date: Tue, 22 Nov 2011 13:32:02 +0100

On Tue, Nov 22, 2011 at 8:26 AM, Daniel Shahaf <d.s_at_daniel.shahaf.name> wrote:
> On Tuesday, November 22, 2011 12:23 AM, "Johan Corveleyn" <jcorvel_at_gmail.com> wrote:
>> Hi all,
>>
>> I'm wondering if it would be feasible to (make it possible to)
>> alter/add copyfrom information in an SVN repository. And if so, would
>> this be a desirable feature?
>
> It's not feasible for FSFS without a format bump and a new layer of
> code. I believe trivial for BDB though.
>
> I'm not really sure it's desirable. If you are to poke a hole in the
> immutability guarantee it should be a very small one.
>
>>
>> It would certainly be useful (although I can't fully estimate the
>> ramifications) for the following use case:
>>
>> - User commits a move, but for some reason it's lacking copyfrom
>> information (possible reasons are 'svn mv A B; svn mv B A' with svn <
>> 1.7 [1]; or user performed a non-svn move; ...)
>
> Not convinced that we should add a new backend feature because users
> don't know the tool they work with.

Users make mistakes. But they (reasonably) expect that mistakes can be
rectified. In this case, SVN doesn't offer a good way to repair this.

Besides, the issue [1] is not the user's fault; it's an SVN bug.
It's fixed now, but there may be hundreds of commits that are already
in the history that were broken because of this bug. In some cases
users don't care, but sometimes they do (usually it's the most
VCS-conscious users that are quite saddened to see a nice line of
history broken).

>> Currently, the only way I know to repair this, is:
>>
>> svn rm thefile
>> svn copy $URL/thefile@REV-BEFORE-BREAKAGE .
>> # replay all the text modifications after the breakage,
>> # and commit them one by one
>> # or alternatively: replace the text by the latest version,
>> # and commit all at once (less nice history (collapsed))
>>
>> This can be a lot of work (especially if a lot of commits have gone by
>> since the breakage, or if multiple files (dirs) are involved which
>> each evolved differently afterwards). Not to mention that it can
>> become quite ugly if commits are replayed one by one (builds failing
>> in the meantime, ...).
>>
>
> It's automatable.

That would be quite a challenge, considering breakage caused by
directory moves and the subsequent divergent evolution of all the
child nodes, each affected in different commits, ... I'm not up for
it, that's for sure.

Also, think about a release branch that was created after the
breakage. And the fact that, between those replayed commits, you'll be
bringing the repository into a state that might not even compile.

> You also don't mention the "other" way of fixing this --- noticing the
> lack of copyfrom information in the commit mail.

Yes, noticing and fixing it immediately is the ideal case. But some
cases will slip by. And what about the dozens of cases that are
already in my history?

> (or just putting up with it. I could argue that your way is dangerous
> since it introduces duplications into history; anyone reading it would
> have to manually verify that the "repeated" commits are in fact
> identical to the original ones past the break-of-copyfrom)

I agree it's dangerous, but I know of no better way. That's exactly my
point. Ok, putting up with it is another way of coping, but it's not
really a solution :-(.

>> In this case, it would be very useful if one could simply add the
>> missing copyfrom information to the repository. I can think of several
>> possible ways:
>> - On a live repository (like editing revprops (possibly protected by a hook))
>> - With an svnadmin command
>> - By dumpfile manipulation, if nothing else
>>
>
> "By an svnadmin command" is pretty meaningless, it describes the UI
> rather than the implementation.

Ok, I was speaking high-levelish. What I meant was: some tool that an
svn administrator can run, with direct access to the back-end.

> The options you have are:
>
> - live filesystem disk tree
> - non-live filesystem disk tree
> - editor drive
> - dumpstream
> - dumpfile
>
>>
>> Thoughts, opinions, ...?
>>
>
> I'm not convinced that we need to complicate the backend because users
> don't know the tool they work with.

As I said, users make mistakes, and SVN bugs do happen.

> I'd suggest that you switch your repositories to BDB and manually edit
> the noderev skels. Alternatively, patching svnsync or svndumptool should
> be simple; perhaps something like the below, plus some machinery to read
> a hash mapping [path, revision] pairs to their new copyfrom tuples.

Heh :-). Switching to BDB is not an option ...

> [[[
> Index: sync.c
> ===================================================================
> --- sync.c (revision 1202151)
> +++ sync.c (working copy)
> @@ -281,10 +281,16 @@ add_file(const char *path,
>
>    if (copyfrom_path)
>      copyfrom_path = apr_psprintf(pool, "%s%s", eb->to_url,
>                                   svn_path_uri_encode(copyfrom_path, pool));
>
> +  if (eb->base_revision == FOO && !strcmp(path, BAR))
> +    {
> +      copyfrom_path = "/baz";
> +      copyfrom_rev = 42;
> +    }
> +
>    SVN_ERR(eb->wrapped_editor->add_file(path, pb->wrapped_node_baton,
>                                         copyfrom_path, copyfrom_rev,
>                                         pool, &fb->wrapped_node_baton));
>
>    fb->edit_baton = eb;
> ]]]

... and patching svnsync/svndumptool isn't either, at this point. But
thanks for the suggestion, that gives me some idea of the minimum
minimorum needed to do a manipulation like this.
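
To make the "machinery" part a bit more concrete for myself: the
mapping from [path, revision] pairs to new copyfrom tuples could start
out as a plain lookup table. Below is a minimal sketch in plain C,
without APR. All names (copyfrom_fix_t, fixes, find_copyfrom_fix) and
the sample paths and revisions are hypothetical, made up for
illustration; nothing here is existing svnsync code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One manual correction: the node added at PATH in revision REV should
   be recorded as a copy of COPYFROM_PATH at COPYFROM_REV.
   (Hypothetical type, not part of Subversion.) */
typedef struct copyfrom_fix_t
{
  const char *path;          /* path that was added without history */
  long rev;                  /* revision in which it was added */
  const char *copyfrom_path; /* intended copy source path */
  long copyfrom_rev;         /* intended copy source revision */
} copyfrom_fix_t;

/* Placeholder data; a real tool would load this from a file
   supplied by the administrator. */
static const copyfrom_fix_t fixes[] =
{
  { "trunk/B", 120, "/trunk/A", 119 },
  { "trunk/doc/new.txt", 245, "/trunk/doc/old.txt", 244 },
};

/* Return the fix registered for (PATH, REV), or NULL if there is none.
   A linear scan is enough for a few dozen entries. */
const copyfrom_fix_t *
find_copyfrom_fix(const char *path, long rev)
{
  size_t i;

  assert(path != NULL);
  for (i = 0; i < sizeof(fixes) / sizeof(fixes[0]); i++)
    if (fixes[i].rev == rev && strcmp(fixes[i].path, path) == 0)
      return &fixes[i];

  return NULL;
}
```

In a patch like the one above, the hard-coded FOO/BAR test would then
presumably become a call to find_copyfrom_fix(), overriding
copyfrom_path and copyfrom_rev only when it returns non-NULL. Which
revision number to match against is something I haven't verified.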

I'm not immediately blocked by this myself, so I won't be going on any
adventures to fix this in our repository (our users are generally
'putting up with it', or fixing it with the 'workaround' mentioned
above).

So my question is more general: wouldn't this be a useful feature? I
think it would be. Ok, it's a form of 'history manipulation', but a
reasonable one I think.

Having a way to do this with svnsync and svndumptool would already be
very useful. It would at least give some assurance to svn admins that
these things are 'repairable'. Being able to fix a live repository
would of course be even better :-).

-- 
Johan

Received on 2011-11-22 13:32:57 CET
