On Thu, 2007-11-01 at 10:41 -0400, Augie Fackler wrote:
> It doesn't actually totally hork up if you do a rename and an edit. My
> friend that uses git for *everything* says that it uses a heuristic to
> determine file ancestry across renames. Supposedly it works really
> really well. I've not used git personally, so I can't offer much
> commentary on how well this works in practice. What I've heard is that
> they track renames and copies about as well as Subversion.
Okay, here's the kool-aid post:
http://permalink.gmane.org/gmane.comp.version-control.git/217
The git "plumbing" doesn't track renames or copies or even relationships
between different versions of files at the same pathname. It has enough
information to construct the tree content for each revision, and
everything beyond that is derived. Tom Lord would definitely
disapprove; this is an extreme form of the "tree-oriented" repository
model. The packed form of the repository might use deltas between
objects which happen to have the same pathname in two different versions
of a tree, or it might use deltas between two objects at
different pathnames which weren't related at all except that they
happened to generate a nice delta. (I have no idea how it finds these
efficient diffs; not going to worry about that now.)
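To make the tree-oriented model concrete, here is a minimal Python sketch. It assumes only the blob-hashing rule (SHA-1 over a "blob <size>\0" header plus the file bytes); the names blob_id and snapshot, and the example paths, are hypothetical, and real trees and commits are left out. The point is that a revision is just a map from pathname to content id, with no rename or ancestry record anywhere.

```python
import hashlib
from typing import Dict


def blob_id(content: bytes) -> str:
    """Content id computed the way git hashes a blob object."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def snapshot(tree: Dict[str, bytes]) -> Dict[str, str]:
    """A revision is only path -> content id; nothing links it to older paths."""
    return {path: blob_id(data) for path, data in tree.items()}


# Hypothetical two-revision history: the file moves, its bytes do not change.
source = b"int add(int a, int b) { return a + b; }\n"
rev1 = snapshot({"src/util.c": source})
rev2 = snapshot({"lib/util.c": source})

# The only evidence of a "rename" is that the same id shows up under a new path.
same_content = rev1["src/util.c"] == rev2["lib/util.c"]
```

Here same_content is true, but that is an inference made after the fact by comparing two snapshots, not something the store recorded.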
A positive consequence from the Linux kernel developers' point of view
is that you can import a series of .tar.gz files (of, say, the Linux
kernel before it was under version control) and your repository model is
as high-quality as if you had been using git from day one. You can apply
changes with a dumb (about trees) tool like patch(1) and as long as you
get to the right tree state, you're golden. It's not clear how many
other projects care about supporting that kind of workflow, but maybe
some do. Certainly, it's a model which will integrate well with a wide
variety of tools that don't know jack about version control.
The "porcelain" is then responsible for deriving file movement at query
time or merge time. More generally, you can try to derive *content*
movement, such as when a block of code moves from one file to another
file, or is refactored from several places into one.
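As a rough illustration of what deriving file movement at query time could look like, here is a Python sketch that pairs deleted paths with added paths by content similarity. This is not git's actual algorithm: the function name detect_renames, the use of difflib as the similarity measure, and the 0.5 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def detect_renames(
    old: Dict[str, str], new: Dict[str, str], threshold: float = 0.5
) -> List[Tuple[str, str, float]]:
    """Guess renames between two tree states (path -> text) by similarity."""
    deleted = {p: t for p, t in old.items() if p not in new}
    added = {p: t for p, t in new.items() if p not in old}
    renames = []
    for old_path, old_text in deleted.items():
        best_path, best_score = None, 0.0
        for new_path, new_text in added.items():
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score > best_score:
                best_path, best_score = new_path, score
        # Accept the best candidate only if it clears the (assumed) threshold.
        if best_path is not None and best_score >= threshold:
            renames.append((old_path, best_path, best_score))
            del added[best_path]  # each added path is claimed at most once
    return renames


# Hypothetical trees: a.c is renamed to b.c and gains one line.
old_tree = {"a.c": "line1\nline2\nline3\n", "README": "hello\n"}
new_tree = {"b.c": "line1\nline2\nline3\nline4\n", "README": "hello\n"}
guessed = detect_renames(old_tree, new_tree)
```

Nothing in either tree says a.c became b.c; the pairing in guessed exists only because the two contents score above the threshold.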
From a theoretical viewpoint, I'm not sold. I don't believe you can
create reliable rename heuristics based purely on tree state. A simple
query operation like "show me this file's history and don't stop at
renames" seems likely to erroneously bottom out a lot, or erroneously
show you a file that wasn't actually part of the history at all. And a
merge in the presence of tree reorganizations (on either side) seems
prone to catastrophic failure. That's certainly no worse than
Subversion--but when Subversion's poor tree merging is a major selling
point for alternative version control systems, I think you want to do
better than you can with heuristics.
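The two failure modes I have in mind can be shown with a toy similarity check in Python. Everything here is hypothetical: the line-based difflib measure, the 0.5 cutoff, and the file contents are made up for illustration and say nothing about how well git's real heuristics behave.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Line-based similarity between two file contents, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.splitlines(), b.splitlines()).ratio()


THRESHOLD = 0.5  # assumed cutoff for calling two files "the same file"

# Shared boilerplate, e.g. a license header pasted into every source file.
LICENSE = "".join(f"/* license line {i} */\n" for i in range(8))

# False positive: one file is deleted and an unrelated file is added in the
# same change. The shared header dominates, so they look like a rename.
parser = LICENSE + "int parse(void);\nint parse_all(void);\n"
lexer = LICENSE + "int lex(void);\nint lex_all(void);\n"
false_positive = similarity(parser, lexer) >= THRESHOLD

# False negative: a file really is renamed, but rewritten in the same change,
# so nothing matches and the history "bottoms out" at the rename.
before = "def load(path):\n    return open(path).read()\n"
after = "class Loader:\n    pass\n"
false_negative = similarity(before, after) < THRESHOLD
```

Both flags come out true with these inputs. Tuning the threshold trades one kind of error for the other, which is the core of my objection.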
From a practical viewpoint, I'm curious about how use cases like
Pidgin's would have panned out. Pidgin (aka Gaim at the time) converted
their CVS tree to svn, started doing a bunch of file renames which they
wished they could have done a long time ago in CVS, ran into merge
spaghetti because svn can't merge across tree reorgs, and ditched svn
for Monotone in pretty short order. When they run into technical
hurdles with Monotone like dodgy Trac integration, they don't look back
because "at least it's not svn." Would Git's heuristics have done a
good enough job to prevent them from developing such a deep antipathy
towards the tool, or would it have been the same experience?
You can find happy, satisfied users of any established version control
tool or it would quickly become disestablished, but that usually just
means they haven't tried to do anything the tool doesn't do well, or
came in with a good understanding of the tool's limitations.
Received on Thu Nov 1 16:52:14 2007