
Re: Copying (branch/tag), dumping, and filtering repository contents

From: Neil Martinsen-Burrell <nmb_at_wartburg.edu>
Date: 2007-09-14 17:28:12 CEST

On Fri, Sep 14, 2007 at 08:59:18AM -0400, Bicking, David (HHoldings, IT) wrote:
> Thank you very much for this summary of the three versions of the
> svndumpfilter alternatives. If I read this correctly, these programs
> essentially truncate history at the point of "copy-in". Would it be
> particularly problematic if such a program instead kept track of those
> copied-in items, and changed the path of the prior versions to match the
> most recent copy? I proposed this earlier
> (http://svn.haxx.se/users/archive-2007-08/0435.shtml). I haven't
> examined the dumpfile format in detail, but I think it can be done. I
> have it on my list of ideas for when I have time to do some OSS
> development. It seems to me to be a potentially valuable feature for
> larger organizations that would like to keep a few choice pieces of very
> large repositories at the cost of potential duplication of some
> historical files.

You are correct. All of the above cut off history at the point of
copy-in. I do believe that it might be possible to modify those tools
to keep some history before the copy-in. Possible problems
include:

1. You have to add the entire directory structure containing the file
that was copied-in. The added item must live in some node in the tree
so you have to include the directory that it lives in. This is true all
the way up the file hierarchy. You could rewrite the location of the
copied-from file, but you have to do that consistently throughout the
previous history. Of course, this is also true for any copy-ins that take
place in the history of the copied-from file.

2. You will be changing the contents of individual commits (not a big
deal, since you always run that risk) by leaving out, say, the other
files added in a commit that added your copied-from file.

3. You need to go through the dumpfile backwards, unlike the filters
mentioned above, and keep track of all the things that you are tracking
back in time in addition to the things you are keeping because of your
filter.
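
To make point 3 a little more concrete, here is a rough sketch of the
backward walk. It assumes the dumpfile has already been parsed into
per-revision node records; the Node class and the trace_back function are
hypothetical names for illustration, not part of svndumpfilter or any of
the tools mentioned above. It only follows copies whose destination is at
or below a tracked path, so a copy of a parent directory of the filter
path is not handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node record from a revision (hypothetical, pre-parsed form)."""
    path: str
    copyfrom_path: Optional[str] = None
    copyfrom_rev: Optional[int] = None


def is_under(path, prefix):
    """True if path is prefix itself or lives somewhere below it."""
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


def trace_back(revisions, keep_prefix):
    """Walk revisions newest to oldest and collect the node paths to keep.

    revisions maps revision number -> list of Node.
    Returns a dict mapping revision number -> list of kept node paths.
    """
    # Each entry is (prefix, max_rev): prefix matters in revisions <= max_rev.
    tracked = [(keep_prefix, max(revisions))]
    kept = {}
    for rev in sorted(revisions, reverse=True):
        for node in revisions[rev]:
            active = [p for p, max_rev in tracked if rev <= max_rev]
            if not any(is_under(node.path, p) for p in active):
                continue
            kept.setdefault(rev, []).append(node.path)
            if node.copyfrom_path is not None:
                # Start following the copy source in older revisions.
                tracked.append((node.copyfrom_path, node.copyfrom_rev))
    return kept


# Made-up history: project/lib.c was copied from vendor/lib.c@2 in r3.
revisions = {
    1: [Node("vendor"), Node("vendor/lib.c"), Node("vendor/other.c")],
    2: [Node("vendor/lib.c")],
    3: [Node("project"), Node("project/lib.c", "vendor/lib.c", 2)],
    4: [Node("project/lib.c"), Node("vendor/lib.c")],
}
kept = trace_back(revisions, "project")
```

In this made-up history, r1 and r2 keep only vendor/lib.c. The parent
directory "vendor" is not kept, so it would have to be added back or the
path rewritten (point 1), and vendor/other.c drops out of r1 (point 2).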

> Of course, some heuristics could be used to determine the case where
> multiple streamlined paths reference the same history that is outside
> the filter, streamline it into one, and "copy" that into the others.
> This would reduce the incidence of "duplicate history". For
> particularly large repositories where one wants to grab say six or seven
> paths, a "path list file" could be referenced as a parameter to specify
> the paths desired. I think this would give one exactly the desired
> projects, with minimal bloat, while automagically reorganizing the
> versions into desired paths. Essentially, this is a repository
> reorganization and compaction.

Someone who wished to scratch this itch certainly could do it. I am not
a commercial software developer, so I may have a different feeling about
the importance of history, but for me the history of the file before it
was copied into my path is not of great value. If that history is of
*significant* value, then it would be worth it to create a tool that
could do this. Peace,

-Neil

-- 
Neil Martinsen-Burrell
nmb@wartburg.edu
Received on Sun Sep 16 20:15:22 2007

This is an archived mail posted to the Subversion Users mailing list.
