On Wed, Apr 16, 2008 at 1:23 PM, Karl Fogel <kfogel_at_red-bean.com> wrote:
> This isn't 1.5-related, but I wanted to post it before I forgot.
>
> Here's a cheap plan for implementing 'svn obliterate'. I'm interested
> in implementing it, but there are more pressing things on my plate right
> now, and there's no reason it has to be me. If someone wants to run
> with this, go for it! I will link to this mail from issue #516.
>
> The Plan:
> =========
>
> Use svn_repos_replay2() to send changes through an "obliterate editor"
> (defined below) to create a new repository that doesn't have whatever is
> being obliterated. As necessary, run repeated "catch up" passes until
> HEAD is the same in both repositories. Then lock the old repository and
> replace its db/ subdir with the one from the new repository. Finally,
> remove the remainder of the old repository.
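
The control flow of the catch-up passes described above could be sketched roughly like this. It is a toy model in plain C, not Subversion code: toy_repo, replay_range(), simulate_commits() and catch_up_and_swap() are hypothetical names, and the "lock" is only a flag. The point is just the shape of the loop: replay unlocked while the source may still be moving, then take the lock for a short final step.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a repository: only the youngest revision matters. */
typedef struct toy_repo {
  long head;    /* youngest revision number */
  int locked;   /* nonzero while commits are blocked */
} toy_repo;

/* Hypothetical hook simulating commits that land during a replay pass. */
typedef void (*commit_hook_t)(toy_repo *src, int pass);

/* Example hook: two commits arrive while the first pass is running. */
void simulate_commits(toy_repo *src, int pass)
{
  if (pass == 0)
    src->head += 2;
}

/* Pretend to replay SRC up to revision TO through the filter into DST.
   A real implementation would drive the obliterate editor here. */
void replay_range(const toy_repo *src, toy_repo *dst, long to)
{
  (void)src;
  dst->head = to;
}

/* Run unlocked catch-up passes until DST has seen everything in SRC,
   then lock SRC for the final step.  Returns the number of passes. */
int catch_up_and_swap(toy_repo *src, toy_repo *dst, commit_hook_t hook)
{
  int passes = 0;

  assert(src != NULL && dst != NULL);
  while (dst->head < src->head) {
    long target = src->head;     /* snapshot HEAD for this pass */
    if (hook)
      hook(src, passes);         /* commits may land during the pass */
    replay_range(src, dst, target);
    passes++;
  }

  src->locked = 1;               /* short critical section starts */
  if (dst->head < src->head) {   /* defensive: a commit slipped in */
    replay_range(src, dst, src->head);
    passes++;
  }
  /* ...the real thing would swap the db/ directories here... */
  src->locked = 0;
  return passes;
}
```

With a source at r10 and the example hook, the first pass replays up to r10 while two commits arrive, and a second pass picks up r11 and r12 before the lock is taken.
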
>
> An "obliterate editor" is an editor created and massaged like so:
>
> /* Set @a *editor and @a *edit_baton to an editor that can obliterate
> * history from @a repos. Allocate the editor in @a pool.
> *
> * Pass @a *edit_baton to svn_repos_obliterate_path() and
> * svn_repos_obliterate_revision() as many times as necessary to specify
> * the obliterations you want, before you use @a *editor.
> *
> * @note @a *editor->open_root() creates a new temporary repository
> * (with the same UUID as @a repos) into which the filtered data is
> * replayed. When done, @a *editor->close_edit() locks @a repos,
> * splices the relevant parts of the temporary repository into
> * @a repos, and removes the temporary repository.

Does it need to create a whole new temporary repository, or just a new
temporary filesystem? And where would it put this temporary repository?
Things to worry about here are whether the "move the db/s around" step
ends up moving things between (OS) filesystems, as well as security (is
it likely to put the temporary repository somewhere that SVNParentPath
can find it, say?).

Also, API-wise, it seems to me that the "obliterate editor" should just
do the filtering, and the code around it should create the temporary
repository, clean it up, etc. So it would just be a customizable
filter, without bundling in the "makes a temporary repo" functionality.
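
To make the cross-filesystem worry concrete: on POSIX systems rename() fails with EXDEV when source and destination are on different filesystems, so whoever creates the temporary repository could check up front that it sits on the same device as the real one. A minimal sketch, assuming POSIX stat(); on_same_device() is a made-up helper, not an existing Subversion or APR function:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Return 1 if PATH_A and PATH_B report the same device ID (so a rename()
   between them is not expected to hit EXDEV), 0 if they differ, and -1
   if either path cannot be stat()ed. */
int on_same_device(const char *path_a, const char *path_b)
{
  struct stat a, b;

  assert(path_a != NULL && path_b != NULL);
  if (stat(path_a, &a) != 0 || stat(path_b, &b) != 0)
    return -1;
  return a.st_dev == b.st_dev;
}
```

Creating the temporary repository as a sibling of the real one would normally satisfy this check, but that is also exactly the placement that raises the SVNParentPath question.
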
--dave
> *
> * @a *editor->abort_edit() just removes the temporary repository.
> */
> svn_repos_get_obliterate_editor(const svn_delta_editor_t **editor,
> void **edit_baton,
> svn_repos_t *repos,
> apr_pool_t *pool);
>
>
> /* In @a edit_baton (received from svn_repos_get_obliterate_editor())
> * specify that @a path and all its copies are to be obliterated.
> *
> * If @a rev is SVN_INVALID_REVNUM, then obliterate every occurrence
> * of @a path in the repository, no matter what its contents or
> * provenance, as though it had never been committed, and likewise
> * obliterate every copywise descendant of @a path.
> *
> * If @a rev is not SVN_INVALID_REVNUM, then obliterate @a path as it
> * appears in @a rev: that is, find the revision in which @a path in
> * @a rev was first committed and make that change not have happened,
> * so that the next change to @a path is whenever it was next
> * committed to after that. Then obliterate all copywise descendants
> * of @a path as it appears in @a rev, except for those that can be
> * re-assigned a copy-history from an unobliterated node revision (in
> * which case, do so).
> *
> * ### TODO: That last requirement is kind of complicated, and there
> * ### may be other reasonable ways to behave too. What can I say:
> * ### this kind of question is precisely why we haven't implemented
> * ### obliterate yet. For those who thought the obstacle was
> * ### difficulty of implementation, rather than the difficulty of
> * ### determining the right behaviors: now do you see what I meant? :-)
> *
> * If @a obliterate_identicals is true, obliterate every version of
> * every path in the repository that has contents identical to @a path
> * (in @a rev if @a rev is not SVN_INVALID_REVNUM).
> *
> * Use @a pool for temporary allocation only.
> *
> * ### TODO: Should we offer a 'keep_copies' flag? I don't see a
> * ### compelling use case for it, though.
> */
> svn_repos_obliterate_path(void *edit_baton,
> const char *path,
> svn_revnum_t rev,
> svn_boolean_t obliterate_identicals,
> apr_pool_t *pool);
>
>
> /* In @a edit_baton (received from svn_repos_get_obliterate_editor())
> * specify that @a rev is to be treated like it never happened.
> * That is, for each path P changed in @a rev, have the same effect
> * as calling:
> *
> * svn_repos_obliterate_path(@a edit_baton, P, @a rev, FALSE, @a pool)
> */
> svn_repos_obliterate_revision(void *edit_baton,
> svn_revnum_t rev,
> apr_pool_t *pool);
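
As a rough illustration of the bookkeeping a baton like this might carry, here is a toy sketch in plain C. Everything in it is hypothetical (obliterate_set, obliterate_set_add(), path_is_within(), should_drop()). It ignores copy history and obliterate_identicals entirely, uses -1 where the proposal uses SVN_INVALID_REVNUM, and assumes the revision in each rule has already been resolved to the revision in which that version of the path was committed.

```c
#include <assert.h>
#include <string.h>

#define TOY_INVALID_REVNUM (-1L)   /* stand-in for SVN_INVALID_REVNUM */
#define MAX_RULES 16

typedef struct obliterate_rule {
  const char *path;   /* path to obliterate, no trailing slash */
  long rev;           /* committed revision, or TOY_INVALID_REVNUM for all */
} obliterate_rule;

typedef struct obliterate_set {
  obliterate_rule rules[MAX_RULES];
  int count;
} obliterate_set;

/* Record a rule.  Returns 0 on success, -1 if the set is full. */
int obliterate_set_add(obliterate_set *set, const char *path, long rev)
{
  assert(set != NULL && path != NULL);
  if (set->count >= MAX_RULES)
    return -1;
  set->rules[set->count].path = path;
  set->rules[set->count].rev = rev;
  set->count++;
  return 0;
}

/* Return 1 if PATH is RULE_PATH itself or lies underneath it. */
int path_is_within(const char *rule_path, const char *path)
{
  size_t len = strlen(rule_path);

  if (strncmp(rule_path, path, len) != 0)
    return 0;
  return path[len] == '\0' || path[len] == '/';
}

/* Return 1 if the change to PATH committed in REV should be left out
   of the replay, 0 if it should pass through untouched. */
int should_drop(const obliterate_set *set, const char *path, long rev)
{
  int i;

  assert(set != NULL && path != NULL);
  for (i = 0; i < set->count; i++) {
    const obliterate_rule *r = &set->rules[i];
    if (path_is_within(r->path, path)
        && (r->rev == TOY_INVALID_REVNUM || r->rev == rev))
      return 1;
  }
  return 0;
}
```

A filtering editor built this way would consult should_drop() for each change and simply not forward the ones that match. The hard part the TODOs point at, reassigning copy history, is not modelled here.
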
>
> Advantages:
> ===========
>
> * The repository remains accessible while obliterate runs, since it just
> locks the repository for a constant amount of time at the end, like
> commit. On the other hand, admins are certainly free to make the
> repository inaccessible during the obliteration if they want to. We
> should probably offer a flag ('svnadmin obliterate --lockout') to make
> that easy to do.
>
> * Obliterate is server-side only, you have to be an admin to run it.
>
> Disadvantages:
> ==============
>
> * The cost of an obliterate is proportional to the total number of paths
> in the repository, not to the number of things being obliterated. Oh
> well. Since it doesn't have to block access, I'm not sure how
> important this is. Also, we can detect certain shortcut cases and do
> them more quickly (for example, deleting HEAD is just a matter of
> removing and tweaking some files/directories; deleting a revision
> older than HEAD when nothing touched in that revision changed between
> then and HEAD is subject to similar shortcuts).
>
> * Obliterate is server-side only, you have to be an admin to run it.
>
> The I-Know-What-You're-Thinking Department:
> ===========================================
>
> Yes, we'll need to rev svn_repos_replay2() to make svn_repos_replay3(),
> which has the following changes:
>
> - There's a new flag 'fulltext_only' that tells the driver to assume
> the consumer has no access to the prior content of files.
>
> - The 'send_deltas' flag is changed to 'skelta_only' and its sense is
> reversed.
>
> The point of the first change is to allow us to leave out one revision
> of a file and still be able to receive subsequent revisions of that
> file. The point of the second change is to rename the 'send_deltas'
> flag, so the first change won't be confusing.
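
One way to read the two flags together, as a toy sketch rather than anything taken from libsvn_repos: replay_flags, content_mode, flags_from_send_deltas() and pick_content_mode() are hypothetical names, and letting skelta_only take precedence over fulltext_only is an assumption, since the proposal does not say what happens when both are set.

```c
#include <assert.h>
#include <stddef.h>

typedef enum content_mode {
  CONTENT_NONE,      /* "skelta": structure only, no real file contents */
  CONTENT_DELTA,     /* text deltas against the previous contents */
  CONTENT_FULLTEXT   /* complete contents, no prior version needed */
} content_mode;

typedef struct replay_flags {
  int skelta_only;
  int fulltext_only;
} replay_flags;

/* Translate the old 'send_deltas' flag: the new flag has the
   opposite sense, and fulltext_only defaults to off. */
replay_flags flags_from_send_deltas(int send_deltas)
{
  replay_flags f;

  f.skelta_only = !send_deltas;
  f.fulltext_only = 0;
  return f;
}

/* Decide how file contents would be transmitted for these flags. */
content_mode pick_content_mode(const replay_flags *f)
{
  assert(f != NULL);
  if (f->skelta_only)
    return CONTENT_NONE;
  return f->fulltext_only ? CONTENT_FULLTEXT : CONTENT_DELTA;
}
```

An obliterating consumer would want CONTENT_FULLTEXT, because a delta against a revision that was filtered out has no base to apply to.
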
>
> General Discussion:
> ===================
>
> In svn_repos_obliterate_path() and svn_repos_obliterate_revision(), you
> can see that there are many, many possible ways they *could* behave, and
> there are arguments for and against all of them. I picked the above
> behaviors somewhat at random. Figuring out how those APIs should work
> has been the central problem of designing 'svn obliterate' all along,
> and is the real reason it has not been implemented yet. I hope that by
> presenting the problem in API form, I've at least clarified the
> questions somewhat.
>
> This proposal doesn't really discuss the command-line interface. I
> think we should decide on the programmatic API and let that suggest the
> natural user interface. (Of course, both should ultimately be derived
> from use cases.) Here I'm just trying to give an overview of *how* to
> implement whatever we decide, not launch the bikeshed-painting party
> that will inevitably come.
>
> As long promised, we completely punt on the working copy side. Admins
> are on their own there. (In fact, sending obliteration commands down to
> the client would be rather undesirable in some use cases anyway.)
>
> -Karl
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe_at_subversion.tigris.org
> For additional commands, e-mail: dev-help_at_subversion.tigris.org
>
>
--
David Glasser | glasser@davidglasser.net | http://www.davidglasser.net/
Received on 2008-04-16 22:42:16 CEST