Re: Copying (branch/tag), dumping, and filtering repository contents

From: Ryan Schmidt <subversion-2007b_at_ryandesign.com>
Date: 2007-09-07 23:42:15 CEST

I'll just reply to a few things:

On Sep 7, 2007, at 11:59, Bicking, David (HHoldings, IT) wrote:

>> Well, the question that didn't make sense was, "If you have
>> history in trunk, how can you just want to extract one branch
>> and keep history which didn't exist there?" But I'm not sure
>> that's the question you asked.
>>
>> I too have wondered what happens if I have, say, a library in
>> my repository, that originated in a different project, and
>> now I want to move that library to its own repository. It
>> seems that's a difficult task. So I probably won't embark on
>> it; I'll probably keep everything in one repository, because
>> it seems nice to do so anyway.
>
> It appears to be impossible in SVN. The most likely scenario for
> extracting one particular branch, all the way to the root, is when one
> wants or needs to pull out the important information and discard the
> chaff. The most likely situation in which this scenario will rear its
> ugly head is when some poor schmuck like me comes in and realizes that
> the repository was not planned, monitored, or otherwise controlled for
> the previous X years. Secondary to that, is the "mistakes were made"
> scenario, or "well, I didn't expect THAT business process change when I
> set things up 3 years ago!".

What aspect of having a repository not be planned, monitored or
controlled makes it difficult for you to plan, monitor and control it
now? You can examine the "svn log" to see what has gone on during the
time the repository was not monitored. You can install post-commit
email hooks to inform everybody of every commit from now on. If
things in the repository are not organized the way you like, you can
"svn mv" them into the correct place now.

One possible problem: if you discover content in the repository that
should not be there, for legal reasons or whatever. So far when
you've talked about wanting to delete old projects I've taken it from
a disk space or neatness/organizational standpoint, about which I'll
say more below. But if things have to be removed for legal reasons,
or you've committed a password file that you shouldn't have, then the
feature you're looking for is the nonexistent "svnadmin obliterate"
which many have requested but which still hasn't really reached the
planning stage; see:

http://subversion.tigris.org/issues/show_bug.cgi?id=516

>> I guess you rather asked how to archive away old projects,
>> especially when a library or another project has been
>> branched off of it at some point. And I think the answer at
>> this point is you don't. You just keep it in the repository.
>
> I can deal with that, if I know that's my only reasonable option. I
> still don't like it. One potential problematic scenario is that one
> might have 10 projects, two of which are currently heavily active.
> There may well be a need and desire to keep all 10 projects in the
> repository, but the two active projects are so active that it becomes
> necessary to archive revisions prior to 6 months ago. Would the
> dump/load process eliminate the other 8 projects because they were last
> revised 8 months ago, or would there be a single, full revision of them
> left in the repository?

It is not usual to archive old revisions. Not in Subversion, anyway.
Subversion makes it difficult for you to do so: You have to take the
repository offline, dump the part you want to keep (which can take
many hours if your repository is large), make a new repository, load
the dumpfile (more hours), remember to copy back in your hooks and
config files, and bring the repository back online. Everyone who had
a working copy needs to throw it away and check out a new one. If
they had uncommitted work in those working copies, they need to
manually move it over to new working copies. Oh, and any locks you
may have had are now gone.
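
To make those steps concrete, here is a rough sketch that only prints the commands involved rather than running them. The paths, the dumpfile name and the trim_plan helper are made up for illustration; svnadmin dump, create and load are the real subcommands, and I'm assuming a standard repository layout with hooks/ and conf/ directories.

```python
import shlex


def trim_plan(old_repo, new_repo, dumpfile, first_rev):
    """Return the shell commands for a dump/load trim, without running them.

    All arguments are hypothetical placeholders; nothing here touches a
    real repository.
    """
    q = shlex.quote
    return [
        # Dump only the revisions you want to keep.
        f"svnadmin dump {q(old_repo)} -r {first_rev}:HEAD > {q(dumpfile)}",
        # Make a new, empty repository and load the dumpfile into it.
        f"svnadmin create {q(new_repo)}",
        f"svnadmin load {q(new_repo)} < {q(dumpfile)}",
        # Hooks and config files are not part of the dump; copy them by hand.
        f"cp -Rp {q(old_repo)}/hooks/. {q(new_repo)}/hooks/",
        f"cp -Rp {q(old_repo)}/conf/. {q(new_repo)}/conf/",
    ]


for command in trim_plan("/srv/svn/old", "/srv/svn/new", "/tmp/keep.dump", 600):
    print(command)
```

The repository has to be offline for all of this, and none of it helps with the working copies or the locks.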

Also, it's not always advantageous to trim away old revisions.
Sometimes this has the effect of making your repository larger on
disk, not smaller. Subversion stores revisions as deltas against some
previous revision, and also, copies are cheap in Subversion, meaning
that if you have a repository with just a trunk, and it occupies
100MB, and you create 50 branches off of that, your repository now
occupies maybe 100.000001MB, because all those copied branches are
recorded in the repository as a reference to the trunk. As soon as
you start making changes to the branches, then only those changes
start getting recorded to the repository. It's very space-efficient.
But it can cease to be if you try to trim old revisions. Suppose the
50 branches were created in various revisions up to revision 500, and
that some work has been done on the branches such that your
repository now occupies 200MB for the trunk and all the branches. Now
you decide to trim away everything before revision 600 by dumping revisions
600:HEAD and loading them into a new repository. You can do this, but
now all 50 branches have to be recorded as complete expensive copies,
since the revisions of trunk from which they were initially copied
are now gone. So instead of occupying 200MB, your repository now
occupies, say, 51*100MB = 5.1GB on disk.
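
The arithmetic behind that example, as a back-of-the-envelope sketch. The figures are the illustrative ones from the paragraph above, not measurements of any real repository, and the two function names are made up. The "after" case assumes the worst: every branch ends up about as large as the original trunk.

```python
def size_with_cheap_copies(trunk_mb, branch_changes_mb):
    """Trunk is stored once; branches cost only the changes made on them."""
    return trunk_mb + branch_changes_mb


def size_after_trim(trunk_mb, branch_count):
    """Once the copy sources are gone, trunk and every branch are each
    stored as a complete copy (assumed to be roughly trunk-sized)."""
    return (1 + branch_count) * trunk_mb


# 100MB trunk plus an assumed 100MB of later changes across the branches.
before = size_with_cheap_copies(trunk_mb=100, branch_changes_mb=100)
# 50 branches plus trunk, each stored in full.
after = size_after_trim(trunk_mb=100, branch_count=50)

print(before, "MB before trimming")
print(after, "MB after trimming")
```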

It sounds like you're envisioning repository maintenance tasks which
are neither necessary nor recommended nor easy. My advice is to
forget that, and just let your repository grow.

>> If you don't want to see that old project in your main
>> repository list, you can "svn mkdir $REPO_URL/old" and "svn
>> mv $REPO_URL/some_old_project $REPO_URL/old" to get it out of
>> the way, but still accessible. If your complaint is with the
>> disk space occupied by the old projects, then the answer from
>> the Subversion team is probably one of their old standards,
>> namely that disk space is cheap. That's not a great answer
>
> Those of us who work in corporate environments have to battle reality.
> We have a situation now where the VSS repository is 10G, and we can't
> get new disk space allocated. There are processes, procedures,
> governances, etc. Eventually it will happen, but in the mean time
> considerably more money will likely be spent dealing with the problem
> than would have been spent with a quick upgrade. Sometimes, the
> complexity of the environment can make a simple volume increase a
> difficult proposition, I suppose.
>
>> sometimes, but I can see where they're coming from. The cost
>> of an extra hard disk or two is less than the cost of
>> developing svndumpfilter further, or implementing svnadmin
>> obliterate, etc.
>
> It's less financially, not necessarily so in business process reality.
> Thank you for your reply.

Of course I can't help with your corporate politics. All I can do is
look at bestbuy.com and see that a 320GB internal drive costs about
USD 120. Get two of those, make a RAID 1 out of them and stick 'em in
a server with a couple free drive bays, and your repository should be
set for a while. That's a fixed one-time cost, versus the unknown cost
of however much time and effort it would take you to devise
workarounds to use Subversion in ways it wasn't really designed for
(archiving old revisions to save space, say). Consider possible
downtime too -- how much would it cost your organization if all your
developers are unproductive for an hour because they cannot access
the repository server because it ran out of disk space and had to be
taken down for an upgrade, or because your archiving operation took
longer than anticipated? Better put sufficient disk in it to begin
with, IMHO.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org
Received on Fri Sep 7 23:40:53 2007
