Re: [RFC] Altering copyfrom information in repository
From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Wed, 7 Dec 2011 12:40:19 +0000 (GMT)
Hi Johan. See below...
On 28 November 2011, Johan Corveleyn wrote:
That could certainly be helpful in implementing one part of any such history-editing feature. I see two difficult areas. Let's say you change rX.
From a high-level point of view, what result do you want when a subsequent revision rY (where Y > X) touches a file or directory that used to exist in rX but no longer does because of the change made to rX? It's not difficult to specify some reasonable options here: adjust rY so that the final state of rY is just as it was, which may involve recreating any nodes that were obliterated from rX; or delete the node; or bail out. It's just a matter of choosing, so in a sense this isn't a difficulty, just a design choice.
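To make the design choice concrete, here is a minimal sketch of the three options as a policy applied when a later revision modifies a path. It is a toy model, not Subversion code: the tree is a plain dict of path to content, and the names `MissingNodePolicy`, `HistoryEditError` and `apply_change` are hypothetical.

```python
from enum import Enum


class MissingNodePolicy(Enum):
    """Hypothetical names for the three options discussed above."""
    RECREATE = "recreate"  # keep rY's final state, re-adding the node if needed
    DELETE = "delete"      # drop the change, leaving the node absent
    BAIL_OUT = "bail_out"  # refuse to continue


class HistoryEditError(Exception):
    """Raised when the policy is to bail out."""


def apply_change(tree, path, new_content, policy):
    """Apply rY's modification of `path` to `tree` (a dict: path -> content).

    Returns a new dict; the input tree is left untouched.
    """
    result = dict(tree)
    if path in result:
        # The node survived the edit to rX, so the change applies normally.
        result[path] = new_content
        return result
    if policy is MissingNodePolicy.RECREATE:
        result[path] = new_content
    elif policy is MissingNodePolicy.DELETE:
        pass  # the node stays absent
    else:
        raise HistoryEditError(f"{path} no longer exists after editing rX")
    return result
```

A real implementation would have to make the same decision for directories, copies and property changes, which this sketch ignores.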
From an implementation POV, as soon as you replace rX with a new rX, the subsequent revisions in the repository become invalid unless the change you made to rX was very simple. Any deltas based on rX, any copy-from pointers, node IDs, and so on, may become invalid. So you can't in general replace rX inside the repository. If you did so, then r(X+1) up to HEAD would immediately become more or less unreadable. One solution is to copy the whole repo up to r(X-1) and then load the new revisions into that copy of the repository. But if you really want to do this inside the repository, which is what I was trying to do, then in order to fix up all the revisions rX+1:HEAD you need to either keep track in memory of what you are updating and rewriting, which gets quite complex, or fork the history inside the repository (leave the old rX in place, write a new chain of revisions rX' rX+1' rX+2', while reading from the original
The benefit of 'forking' the chain of revisions is that the repository filesystem code can still read the old revisions on request, and so you could for example convert them into dump file format. By contrast, keeping track in memory of what you are updating and rewriting, and traversing rX:HEAD fixing up as you go, must necessarily be done at a very low level, because those revisions are already 'broken' by the time we come to fix them up, and so they cannot be read by the normal APIs.
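The forking idea can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a repository is a list of revisions, each revision is a dict mapping a path to its new content (or None for a deletion), and the edit to rX is "drop every change to one path from rX onward". The function names `tree_at` and `fork_without_path` are made up for the illustration.

```python
def tree_at(revs, n):
    """Replay revisions 0..n and return the resulting tree (path -> content)."""
    tree = {}
    for changes in revs[: n + 1]:
        for path, content in changes.items():
            if content is None:
                tree.pop(path, None)  # None marks a deletion
            else:
                tree[path] = content
    return tree


def fork_without_path(revs, x, obliterated):
    """Build a forked chain that drops every change to `obliterated` from rX on.

    The original `revs` list is only read, never modified, so the old
    chain stays fully readable while the new one is being written.
    """
    forked = [dict(changes) for changes in revs[:x]]  # r0..r(X-1) carried over
    for old_changes in revs[x:]:
        # Each new revision is derived by reading the intact old one.
        forked.append(
            {p: c for p, c in old_changes.items() if p != obliterated}
        )
    return forked
```

The point of the sketch is that every new revision is produced by reading an old revision through the ordinary read path, which is what is lost once rX has been replaced in place.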
That's the stuff I tried to get my head around before.
If we choose to only support some very limited transformations within rX, then the 'traverse rX+1:HEAD, fixing them up as we go' approach could perhaps be simple enough to be feasible. But it's still low-level code and thus specific to each FS back-end, with the problem that FSFS is more in demand but BDB is much easier to do this sort of thing in.
Now I'm thinking the 'fork history inside the repo' or 'clone the repo' approaches are better, even though they require more disk space and/or more time, because being higher level gives several advantages. If we adapt your idea of making it easier to 'replace' a revision, and instead make it easier to import and export a revision, then that would certainly be a useful part of such a solution.
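For the 'clone the repo' approach, a rough sketch of the steps using the existing svnadmin dump/load commands might look like the following. The helper `clone_plan` and the file holding the edited rX are hypothetical, and the sketch assumes X is at least 1 and less than HEAD. It only builds the command lines and does not run them. It also loads r(X+1):HEAD unmodified, so any fixing up of those later revisions in the dump stream would still have to be added.

```python
import shlex


def clone_plan(old_repo, new_repo, x, edited_rx_dump):
    """Return shell command lines that rebuild a repository around a new rX.

    `edited_rx_dump` is assumed to be a dump file containing only the
    replacement revision rX (a hypothetical input for this sketch).
    """
    if x < 1:
        raise ValueError("x must be at least 1")
    old, new, edited = (shlex.quote(p) for p in (old_repo, new_repo, edited_rx_dump))
    return [
        f"svnadmin create {new}",
        # Copy everything before the edited revision.
        f"svnadmin dump {old} -r 0:{x - 1} | svnadmin load {new}",
        # Import the replacement for rX.
        f"svnadmin load {new} < {edited}",
        # Replay the remaining revisions as deltas against what is already loaded.
        f"svnadmin dump {old} -r {x + 1}:HEAD --incremental | svnadmin load {new}",
    ]
```

For example, `clone_plan("old", "new", 5, "r5.dump")` yields four command lines, the second of which copies r0 through r4.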
> As I said earlier in this thread, I'm staying away from direct
This is an archived mail posted to the Subversion Dev mailing list.