
Announcing reposurgeon, and requesting fast-import support.

From: Eric Raymond <esr_at_snark.thyrsus.com>
Date: Tue, 9 Nov 2010 12:17:50 -0500 (EST)

Some months back I contributed svncutter to Subversion. This was a tool
for doing surgery on dumpfiles intended to remove artifacts associated with
conversions from older VCSes.

My interest in tools for repository surgery has continued, and I recently
spotted an opportunity in the increasing use of git-fast-import streams
as a history-interchange format. I have written what I believe is the first
*native* application for fast-import streams, a repository editor I
call reposurgeon.

You can read the announcement here: http://esr.ibiblio.org/?p=2718

Project resource page with tarballs: http://www.catb.org/~esr/reposurgeon/

Freshmeat page: http://freshmeat.net/projects/reposurgeon

HTML manual: http://www.catb.org/~esr/reposurgeon/reposurgeon.html

Perhaps the most interesting thing about reposurgeon is that, by
design, it knows almost nothing about any individual VCS. All it
counts on is the ability to get a fast-import dump from a repo and
then the ability to create a repo from the dump after the contents of
the import stream have been modified.

If you hadn't heard about this before, it's because the project is in
alpha and only two weeks old. Nevertheless, it is already good enough
for production use on git repositories. Operations supported include
editing of commit and tag metadata, deletion of commits, expunges of
file history, coalescing single-file commit cliques with identical
comments, and topological cut. The code is backed by an extensive
regression-test suite and fully documented.

I also have working support for bzr and hg, though the practical utility
of same is presently limited by unstable and poorly-supported export/import
tools. I'm working with a bzr dev to address this problem; better solutions
should be forthcoming within weeks, if not days.

Which brings me to my feature request. Please add native support for
fast-export and fast-import to svndump. This would be a good idea
in general, but my specific reason for wanting it is to enable
reposurgeon to edit Subversion repositories.

The export side is, of course, almost trivial. Proof of concept under
MIT license is here: <http://c133.org/code/svn-fast-export.c>. It
needs a bit of extension work around tags and branches; I won't
belabor the obvious (and easily solvable) issues with those. There are
two more substantive ones:

1) Whatever merge-tracking hair you represent internally should be dumped
as 'merge' commit properties.

2) User commit properties (i.e. those not in the svn: namespace)
should be exported using the bzr properties extension, which
reposurgeon handles now and which seems likely to make it into git core at
some point. Syntax:

   property <space> NAME <space> VALUE-LENGTH <space> VALUE LF

or, if the value is empty:

   property <space> NAME LF
 
NAME and VALUE are utf8-encoded. The properties for each commit are sorted
by the property name.

Also note that an import stream actually containing commit-property declarations
should have a line reading "feature commit-properties" before the first commit.
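
To make the syntax above concrete, here is a minimal Python sketch of how an
exporter might emit these lines. It is not taken from reposurgeon or bzr; the
function names are hypothetical, and it assumes that VALUE-LENGTH counts the
UTF-8 bytes of VALUE and that sorting by name means plain code-point order.

```python
# Hypothetical helper names; a sketch of the property syntax described above,
# not code from reposurgeon, bzr, or Subversion.

FEATURE_LINE = b"feature commit-properties\n"  # goes before the first commit


def format_property(name: str, value: str = "") -> bytes:
    """Serialize one commit property as a single stream line."""
    name_bytes = name.encode("utf-8")
    if not value:
        # Empty value: "property <space> NAME LF"
        return b"property " + name_bytes + b"\n"
    value_bytes = value.encode("utf-8")
    # Assumption: VALUE-LENGTH is the byte length of the UTF-8 encoded value.
    length = str(len(value_bytes)).encode("ascii")
    return b"property " + name_bytes + b" " + length + b" " + value_bytes + b"\n"


def format_properties(props: dict) -> bytes:
    """Serialize all properties of one commit, sorted by property name."""
    return b"".join(format_property(name, props[name]) for name in sorted(props))
```

For example, format_properties({"reviewer": "fred", "bug": ""}) would yield the
line "property bug" followed by the line "property reviewer 4 fred".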

The import side is less trivial, but given that you've already got internal
representations for merge-tracking it shouldn't be too difficult either.

I'd offer to do this, but I'm deliberately staying away from writing
export/import code myself, other than the implementations inside
reposurgeon. It will be better, long-term, if my reposurgeon
assumptions don't leak into other implementations; they ought to be
engineered from the fast-import stream documentation. See the
definitive web page at:

<http://www.kernel.org/pub/software/scm/git/docs/git-fast-import.html>.

Finally, I will note that I think this feature could be significant
for Subversion's competitive posture. Because exporters are easy while
importers are more difficult, supporting import streams only with
exporters and only through sketchy third-party tools tends to
encourage migration to git while discouraging migration away from it.

Other VCSes, with bzr taking point, are mainlining importers and thereby
positioning themselves as destinations rather than places to leave. As
a friend of Subversion, I strongly recommend that it do likewise.

-- 
		Eric S. Raymond
A human being should be able to change a diaper, plan an invasion,
butcher a hog, conn a ship, design a building, write a sonnet, balance
accounts, build a wall, set a bone, comfort the dying, take orders, give
orders, cooperate, act alone, solve equations, analyze a new problem,
pitch manure, program a computer, cook a tasty meal, fight efficiently,
die gallantly. Specialization is for insects.
	-- Robert A. Heinlein, "Time Enough for Love"
Received on 2010-11-09 18:18:26 CET

This is an archived mail posted to the Subversion Dev mailing list.
