Karl Fogel wrote:
> The hard part is to define exactly what it will do, see
> for more on that.
> We badly need the feature, everyone agrees -- but defining exactly
> what the feature is is the hard part :-).
I read the comments in the bug report.
The feature I need is to remove a file/directory from the repo.
It seems to me (naively) that this means removing any deltas,
any history, any logs, etc. Make it as if the file/dir were
never added in the first place. This would seem to address some
of the problems mentioned: someone checks in binaries or checks
in a source tree in the wrong location.
Tagging a node as "obliterated" in the repo would not recover
any wasted space, but it would logically act as if the file/dir
had been deleted. A compress operation (i.e., dump/restore
of the archive) would recover the space. Not as nice as recovering
the space when the file/dir is obliterated, but it makes the
obliterate function reversible. (The more secure variation
is to replace the node data with a comment like "redacted" as
mentioned in the bug report, but that makes the obliterate
operation irreversible.) (This was Jason Robbins' suggestion.)
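To make the distinction concrete, here is a minimal sketch of the "tag now, reclaim space later" idea. It is a toy in-memory model, not Subversion code: `ToyRepo`, `Node`, `obliterate`, `restore`, and `compact` are all hypothetical names, and `compact` only stands in for a dump/restore cycle that skips tagged nodes.

```python
from dataclasses import dataclass


@dataclass
class Node:
    path: str
    data: bytes
    obliterated: bool = False  # tombstone flag; the data is still stored


class ToyRepo:
    """Hypothetical stand-in for a repository; not the svn API."""

    def __init__(self):
        self.nodes = {}

    def add(self, path, data):
        self.nodes[path] = Node(path, data)

    def obliterate(self, path):
        # Logical removal only: the node is hidden, nothing is freed.
        self.nodes[path].obliterated = True

    def restore(self, path):
        # Possible only because obliterate() kept the data.
        self.nodes[path].obliterated = False

    def visible_paths(self):
        return sorted(p for p, n in self.nodes.items() if not n.obliterated)

    def stored_bytes(self):
        return sum(len(n.data) for n in self.nodes.values())

    def compact(self):
        # Stand-in for dump/restore: copy everything except tagged nodes.
        fresh = ToyRepo()
        for n in self.nodes.values():
            if not n.obliterated:
                fresh.add(n.path, n.data)
        return fresh
```

In this sketch the operation stays reversible until `compact` is run; after that the tagged data is gone, which matches the trade-off described above.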
I'm a bit unclear about the comments that obliterating
a file/dir would make working copies invalid. What happens
when svn sees an entry for a file in a working copy when no
file exists in the repo? Or better, what should happen?
There were also comments about deltas against obliterated
nodes. I'm unfamiliar with the internals of svn, but if there
are deltas against a file which is obliterated, then it seems
like all of these deltas should also be obliterated. I don't
understand the comment about needing to "re-delta" nodes.
It may be that there is a different requirement to obliterate
some intermediate update, while retaining the file/dir in
the repo. That's something different from what I would like
to see. Some of the comments (like those of Ben Collins-Sussman)
seem to address this problem.
It seems that for many applications it would be satisfactory
to take the repo offline while compressing (dump/restore) it,
eliminating space used by obliterated files. There are comments
that this is not a satisfactory solution for large repos, but it
would seem to provide a stepping stone to a more comprehensive
solution.
That better solution might be to walk the database looking
for obliterated nodes and then remove them and any nodes
which reference them, recursively. (Obviously, this is done
bottom up, removing nodes which have no references and removing
the references to the nodes.) I'm speaking from a naive viewpoint,
since I haven't looked at the code. This sounds to me like
something which can be done in the background as a maintenance
activity, without taking the repo offline.
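The walk described above could be sketched roughly as follows. This is only an illustration under my own assumptions: nodes are plain ids, `base_of` is a hypothetical map from a node to the node its delta is stored against, and the references form no cycles. None of this reflects how the svn database is actually laid out.

```python
def sweep_obliterated(nodes, base_of, obliterated):
    """Return (removal_order, survivors) for a toy delta graph.

    nodes       -- set of node ids
    base_of     -- dict: node id -> id of the node it references, or None
    obliterated -- set of node ids tagged as obliterated
    """
    # Invert the references: which nodes depend on each node?
    dependents = {n: set() for n in nodes}
    for n, base in base_of.items():
        if base is not None:
            dependents[base].add(n)

    # Mark: an obliterated node dooms everything that references it,
    # directly or indirectly.
    doomed = set()
    stack = list(obliterated)
    while stack:
        n = stack.pop()
        if n not in doomed:
            doomed.add(n)
            stack.extend(dependents[n])

    # Sweep bottom up: a doomed node is removed only once nothing
    # references it any more.
    remaining = {n: set(dependents[n]) for n in doomed}
    ready = sorted(n for n in doomed if not remaining[n])
    removal_order = []
    while ready:
        n = ready.pop()
        removal_order.append(n)
        base = base_of.get(n)
        if base in remaining:
            remaining[base].discard(n)  # drop the reference to the base
            if not remaining[base]:
                ready.append(base)

    return removal_order, nodes - doomed
```

Because each step removes a single unreferenced node, a pass like this could in principle be run in small increments, which is what makes the background-maintenance idea seem plausible to me.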
Michael Eager firstname.lastname@example.org
1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Received on Tue Sep 5 21:10:26 2006