
RE: Another request for obliterate...

From: Weintraub, David <David.Weintraub_at_ilex.com>
Date: 2005-04-19 17:39:50 CEST

The Prune idea is basically what I am thinking of as an "obliterate obliterate"
command. It starts with the most recent revision, and prunes backwards through
all of the versions. It will not allow you to prune if there is no version in
the latest version of the directory, or if there are any copied URL links to
another directory. I would also limit it to pruning no more
than "X" versions of the archive. By limiting the "obliterate obliterate",
you simplify the implementation and make sure you're not removing anything
"interesting".
 
This will not work with the "rmversion obliterate" command I was referring
to. Let's say the history of file foo.c looks like this:
Version  Description
=======  ===========
18       Good Build
19       Release 1.0
20       Bad Build
24       Good Build
27       Bad Build
28       Bad Build
31       Good Build
I want to remove versions 20, 27, and 28, but I want to keep versions 18, 19,
24, and 31. In a few weeks, I will also be removing versions 18, 24, and 31,
but I never want to remove version 19. Under your scenario, I can only prune
completely backwards. That is, if I want to remove version 20, I would have
to remove versions 24 and 31 too (which I don't want to do).
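
As a rough illustration of the difference, here is a minimal Python sketch. It
only models the history of foo.c as a dictionary; neither command exists in
Subversion, and the function names prune_from and rmversion are hypothetical.

```python
# Hypothetical model of foo.c's history, taken from the table above.
HISTORY = {
    18: "Good Build",
    19: "Release 1.0",
    20: "Bad Build",
    24: "Good Build",
    27: "Bad Build",
    28: "Bad Build",
    31: "Good Build",
}


def prune_from(history, rev):
    """Model of "prune": drop the given revision and every later one."""
    return {r: desc for r, desc in history.items() if r < rev}


def rmversion(history, revs):
    """Model of selective "rmversion": drop only the named revisions."""
    doomed = set(revs)
    return {r: desc for r, desc in history.items() if r not in doomed}


# Pruning at version 20 also takes the good builds 24 and 31 with it.
after_prune = prune_from(HISTORY, 20)

# rmversion removes just the bad builds and leaves everything else alone.
after_rmversion = rmversion(HISTORY, [20, 27, 28])
```

Under this model, pruning at version 20 leaves only versions 18 and 19, while
the selective removal keeps 18, 19, 24, and 31.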
 
I realize that this is a very difficult task to program into Subversion, and
I am not expecting anything soon. There are a lot of issues to work out. For
example, if I remove version 27 and 28 of this file, what is in archive
version 27 and 28? Should it be the same file that was in version 24? Maybe
I shouldn't be able to rmversion a single file, but have to rmversion an
entire archive so there won't be a version 27 and 28 of the archive. As long
as those versions of the archive aren't linked to anywhere else, I might be
able to assume these particular versions of the archive are not interesting.
 
Of course, the funny thing is in ClearCase, I labeled all of the builds --
including the bad ones. After all, how do I know a build is bad until it is
built and tested by our team? If a build ended up being bad, I deleted the
label which allowed me to rmversion the built binaries. If I want to
duplicate this in Subversion, I would have to have some way to "uncopy" a
tag, or be able to let the rmversion command know it is okay to remove a
particular version even if it is copied to the tag directory.
 
Which would make this type of command even more difficult to implement.
Meanwhile, I now have a reason why binaries should not be stored in
Subversion.

 -----Original Message-----
From: Tim Hill [mailto:tim@realmsys.com]
Sent: Monday, April 18, 2005 6:39 PM
To: Weintraub, David
Cc: 'Subversion Users'
Subject: Re: Another request for obliterate...

Good points. The more I think about this the more I feel that a "prune"
command is really the best compromise. Something like:

    svnadmin prune REPOPATH -r REV PATH ...

Prunes the specified path(s) starting at the specified revision from the
specified repository. Each path (file or directory) is obliterated from the
specified revision, and all subsequent revisions, including all branches
made at or after the specified revision. The change is permanent. Pruning at
the revision where the file was originally added to the repository will
obliterate all traces of the file. Pruning at a branch revision will
obliterate all traces of the file on that branch.

My gut feeling is that this will accommodate 80% of users' needs but keep the
model simple enough so that it can actually be used without leading to
disasters.

I've also seen lots of shops where entire build toolchains are checked-in.
The rationale here is usually broken, and consists of either (a)
over-loading the SCC system as a backup system or (b) a fabled "we need to
be able to reconstruct our build environment". Of course, in the latter
case, you need more than just the toolchain (think OS etc etc.).

Incidentally (OT), I now use virtual machines as a way to maintain build
environments. Just ZIP the whole thing up and put on optical media -- VM,
OS, tools, etc etc.

--Tim

Weintraub, David wrote:

I vote for an obliterate command, but we are talking about two separate
commands "obliterate" and "remove version (rmver)":

* I've been a CM admin for about 15 years, and I find that a user will
request me to obliterate a file about 3 or 4 times per year. Mostly with new
files that were accidentally added and contained sensitive information. I've
never "obliterated" files with substantial histories, and I'd probably
refuse if a user requested that I do -- especially if it involves stuff that
was released (either internally or externally).

* I find "rmver" a bit more useful. In ClearCase, developers don't develop
off of the head of the trunk (called /main in ClearCase). Instead, they
create a branch and do their development work on that branch. Once they've
determined that their code works and it is stable, they would merge their
work onto the head of the trunk. (In ClearCase, the trunk was supposed to be
always stable and releasable.)

If you look at a version tree of a file, you'll see dozens of branches
merging in and out of the /main branch. To clean up this mess, many places
have a policy of removing "dead" development branches and versions. You
still have most of the versioning information since you're not deleting
anything off of the main trunk. You're only deleting old versions that even
the developers no longer care about. This speeds up many of the scripts we
use (imagine how long the "blame" command would take if you have a file with
20 versions on the main trunk, and hundreds of versions on a dozen different
side branches) and speeds ClearCase up a bit too. However, it doesn't save
very much room since you're only storing the deltas.

* It is extremely common -- despite what people may claim "best practice"
states -- to put binaries of compiled programs in your archive. This gives
you a single place where developers can get precompiled libraries to develop
against, it gives everyone a single location for a guaranteed-to-be-valid
release, and you know that the System Admins are backing this up on a daily
basis.

The problem is that binaries take up a ton of room. If you're building every
single day, and each build contains 10 to 20 gigabytes of binary data,
you'll fill up a network disk area no matter how big it is. We are
constantly removing old versions of libraries and executables that were
never released. Our policy was to remove all binaries from any "bad build",
anything from a "good build" over two weeks old, and keep any binaries from
an actual release until those binaries are no longer supported.
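
The retention policy just described could be written down roughly as follows.
This is only a sketch: the Build record, its field names, and the should_remove
function are assumptions made for illustration, not part of Subversion or
ClearCase.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Build:
    """Hypothetical record describing one build's binaries."""
    revision: int
    status: str             # "bad", "good", or "release"
    built_on: date
    supported: bool = True  # only meaningful for releases


def should_remove(build, today, max_good_age=timedelta(weeks=2)):
    """Apply the policy: drop bad builds, good builds over two weeks old,
    and releases that are no longer supported."""
    if build.status == "bad":
        return True
    if build.status == "good":
        return today - build.built_on > max_good_age
    if build.status == "release":
        return not build.supported
    raise ValueError(f"unknown build status: {build.status!r}")
```

For example, a good build made on 2005-04-01 would be removable on 2005-04-19,
while a supported release is kept no matter how old it is.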

I would like both versions of the "obliterate" command (obliterate and
rmversion), but then I'd also like a million dollars and a pony. The
"obliterate obliterate" command might be easy to implement if we simply put
on restrictions of what it can obliterate. Maybe a file that has only one
version of itself, is not on any branches or labels, and is still in the
HEAD of the trunk. Maybe something that is in no more than "X" versions of
the archive where "X" is a fairly small number. If you make a booboo and
accidentally put in a file that shouldn't be in the archive, you can ask the
CM to obliterate it, but you better ask pretty quick. Even with those
restrictions, it would cover about 98% of the need for obliterate.
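
One possible reading of those restrictions, as a small Python check. The
FileInfo record, its fields, and the default limit of 5 revisions are all
assumptions for illustration; "X" is left unspecified above, and Subversion
offers no such command.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical summary of a file's footprint in the repository."""
    revisions_containing: int  # how many archive revisions include the file
    copy_count: int            # copies onto branches or tags
    in_head: bool              # still present in HEAD of the trunk


def can_obliterate(info, max_revisions=5):
    """Allow obliterate only for recent, uncopied files still in HEAD.

    max_revisions stands in for the "X" limit; 5 is an arbitrary choice.
    """
    return (
        info.in_head
        and info.copy_count == 0
        and info.revisions_containing <= max_revisions
    )
```

A file added by mistake two revisions ago and never copied would pass the
check; the same file after it has been tagged would not.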

The "rmversion obliterate" command is much, much harder to implement for
reasons I outlined before. You are going to have side effects, and must
determine how you handle those side effects before you even dream about
coding. In ClearCase, we could not (at least easily) remove a version of a
file that had a branch coming out of it, or had a label on it or was
"interesting" in any other way.

But then, ClearCase versioned files and not the entire archive, so doing a
"rmversion" had limited side effects. And, these side effects were well
understood. In Subversion, where the whole archive is versioned, the effects
are much larger and more unpredictable. For example, how could I make sure I
am not accidentally removing an "interesting" version of a file? That is, a
version of the file with a tag/label on it or a file that is at the root of
a branch. In ClearCase, we would prune dead branches and remove all of those
versions. But, we didn't want to remove versions of files with labels (tags)
on them or files that are used for work that is being actively developed.

In Subversion, there is no difference between a branch and a tag except for
what exists in between the ears of the CM. How can we make sure Subversion
knows that a particular version of the file we want to remove isn't
"interesting"?

Right now, I am going to discourage my company from versioning binary files.
We will store binaries on a share and just hope that the SysAdmin is backing
up those areas on a daily basis. As long as we are only storing deltable
files, disk space won't be a major problem.
Received on Tue Apr 19 18:02:09 2005

This is an archived mail posted to the Subversion Users mailing list.
