SVN 1.5 Status -- and merge tracking

From: Mark Phippard <markphip_at_gmail.com>
Date: 2007-11-15 20:18:15 CET

It seems like overall things are stabilizing -- unless we are avoiding
the issue tracker more lately. There have not been as many new issues
added to the 1.5 milestone in the last couple weeks as there had been
previously. The overall pace of change is slowing. Those are good
signs that we are on the final path towards the release.

That being said, there are some very big changes underway related to
merge tracking. Let me do my best to tell the story.

WARNING: rambling summary to follow!

A few weeks to a month ago, a few nagging issues related to merge
tracking still remained:

* svn cp WC WC needed to contact the repository to gather mergeinfo.
We currently have a -g option to enable this. No one feels good about
the option or the command contacting the repository.

* we wanted a way for people to dump/load or run some kind of script
that would populate some mergeinfo on existing repositories based on
the copy activity that had happened in the past.

* we later realized that svndumpfilter needs to deal with mergeinfo.

* cyclic merges are not handled well. (Sync a feature branch with
trunk, then eventually merge it back to trunk.)

Another one (#2953) that came up around this time was that cmpilato
noticed that merge tracking did not really do a good job of
normalizing merge sources based on the repository history. In his
words:

"Subversion's merge tracking (and merge auto-calculation) code is not
normalizing merge source information to real repository locations --
that is, paths and revisions for which the FS API function
svn_fs_check_path() would not indicate a missing object."

He did a good job adding some new API and hooking this into the code
and the issue was closed. (I'll come back to this in a bit though)

So with Mike in the mode of thinking about the repository and all the
information it carries, and the problems I mentioned previously, he
had the idea that merge tracking ought to use this repository history
to determine implicit mergeinfo. And it just so happens that he had
just created a bunch of API that would give this information.

This would mean when you do things like copy, we would not need to
create a mergeinfo property at all. The merge feature would get this
"implicit mergeinfo" from the repository and what it knows about an
item's ancestry. The mergeinfo property would simply need to record
actual merge activity. It would also mean that pre-1.5 repositories
would just automatically have implicit mergeinfo when they were
connected to a 1.5 server. He is off working on this right now in a
branch.
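
To make the implicit mergeinfo idea a bit more concrete, here is a
minimal Python sketch. It is not Subversion code: the segment tuples,
the function names, and the revision numbers are all made up for
illustration, and revisions are modeled as plain sets rather than
ranges. The point is only that ancestry supplies the "implicit" part,
the property supplies the recorded merges, and merge works from the
union of the two.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical model: (path, first_rev, last_rev) describes one stretch of
# an object's history, e.g. where it lived before and after a copy.
Segment = Tuple[str, int, int]
Mergeinfo = Dict[str, Set[int]]


def implicit_mergeinfo(segments: List[Segment]) -> Mergeinfo:
    """Derive mergeinfo purely from ancestry; nothing is stored on copy."""
    info: Mergeinfo = {}
    for path, first, last in segments:
        info.setdefault(path, set()).update(range(first, last + 1))
    return info


def effective_mergeinfo(implicit: Mergeinfo, explicit: Mergeinfo) -> Mergeinfo:
    """Union of ancestry-derived mergeinfo and recorded merge activity."""
    combined = {path: set(revs) for path, revs in implicit.items()}
    for path, revs in explicit.items():
        combined.setdefault(path, set()).update(revs)
    return combined


def revisions_to_merge(source: str, candidates: Iterable[int],
                       implicit: Mergeinfo, explicit: Mergeinfo) -> List[int]:
    """Candidate revisions from `source` that the target does not yet have."""
    have = effective_mergeinfo(implicit, explicit).get(source, set())
    return sorted(set(candidates) - have)


# Assumed example: a branch copied from /trunk@100 that has since had
# trunk revisions 105 and 106 merged into it.
history = [("/trunk", 1, 100), ("/branches/feature", 101, 120)]
recorded = {"/trunk": {105, 106}}
pending = revisions_to_merge("/trunk", range(95, 111),
                             implicit_mergeinfo(history), recorded)
```

In this model a pre-1.5 repository needs no conversion step, since
the `history` input comes from data the repository already has.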

There are of course some ramifications to all this. I think a lot of
them, such as removing some of the stuff from copy, are fairly obvious.
I will just point out some of the "new" ones I can think of:

1) We almost certainly need to start recording more explicit
information about "negated merges". Think of the scenario that you
create a branch from trunk, and then use reverse merge to remove some
of the revisions that came from trunk. You need to record that you do
not have these revisions in the mergeinfo (since the implicit
mergeinfo will say you have them). There is an issue for this.

2) I know there is more than this ... can't think of it at the moment.
Obviously anything that works with mergeinfo is somewhat impacted.

3) It turns out the changes Mike made for #2953 have some negative
performance implications. With current trunk, a URL to URL copy in
the Apache repository could run for an hour or more. A user on IRC
reports that a trunk with 400k files crashes his svnserve on a URL to
URL copy. There is an index in BDB that can make this run quickly;
glasser is looking into doing the same in SQLite so that fsfs and BDB
can both share it.
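
As an illustration of point 1 (negated merges), here is a small
Python sketch of why the extra record is needed. It is a toy model
under my own assumptions, not the design recorded in the issue: the
function names are hypothetical and revisions are plain sets.

```python
from typing import Iterable, List, Set


def effective_revisions(implicit: Set[int], merged: Set[int],
                        negated: Set[int]) -> Set[int]:
    """Revisions the branch really carries from the source.

    implicit: revisions implied by ancestry (the copy from trunk)
    merged:   revisions recorded as explicitly merged
    negated:  revisions recorded as removed by a reverse merge
    """
    return (implicit | merged) - negated


def eligible_revisions(candidates: Iterable[int], implicit: Set[int],
                       merged: Set[int], negated: Set[int]) -> List[int]:
    """Candidate revisions that a later merge should still apply."""
    have = effective_revisions(implicit, merged, negated)
    return sorted(set(candidates) - have)


# Assumed scenario: branch created from trunk at r50, then r40-r42 were
# reverse-merged out of the branch.
from_ancestry = set(range(1, 51))
backed_out = {40, 41, 42}

# Without the negation record, ancestry claims r40-r42 are still present.
without_record = eligible_revisions(range(40, 61), from_ancestry, set(), set())
# With it, r40-r42 become eligible for merging again.
with_record = eligible_revisions(range(40, 61), from_ancestry, set(), backed_out)
```

Here `without_record` omits r40-r42 even though the branch no longer
contains them, which is exactly the gap described above.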

Given that #2953 is essential to have any kind of correctness in merge
tracking, we need to leave the code in place and hope indexing will
speed it up. However, another benefit of the work that Mike is doing
in the branch is that copy will go back to the way it worked in 1.4
and will not have to do this work. Instead, the work will shift to
merge. Mike says that merge does not need to hit this information as
hard as copy does, so the performance cost should be smaller there. Of
course, while we do not want merge to be slow, a slow merge is a lot
more acceptable than a slow copy. Finally, if the
work glasser is doing pans out, we could get the correctness AND good
performance.
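
For anyone wondering what the indexing idea amounts to, here is a
rough Python sketch of a lookup cache kept in SQLite. This is only
my illustration and not what glasser is writing: the table name, the
column names, and the `compute_origin` callback are all hypothetical,
and it assumes that a node's origin never changes once known, so a
cached answer stays valid.

```python
import sqlite3
from typing import Callable


class NodeOriginCache:
    """Remember the result of an expensive history walk in SQLite."""

    def __init__(self, conn: sqlite3.Connection,
                 compute_origin: Callable[[str], str]) -> None:
        self.conn = conn
        # Stand-in for the slow walk back through repository history.
        self.compute_origin = compute_origin
        conn.execute(
            "CREATE TABLE IF NOT EXISTS node_origins ("
            "node_id TEXT PRIMARY KEY, origin TEXT NOT NULL)"
        )

    def origin(self, node_id: str) -> str:
        row = self.conn.execute(
            "SELECT origin FROM node_origins WHERE node_id = ?",
            (node_id,),
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: no history walk needed
        origin = self.compute_origin(node_id)  # cache miss: do the slow work
        self.conn.execute(
            "INSERT OR REPLACE INTO node_origins (node_id, origin) "
            "VALUES (?, ?)",
            (node_id, origin),
        )
        self.conn.commit()
        return origin
```

Because the cache lives outside the filesystem backend, the same
code path could in principle serve both fsfs and BDB, which is the
sharing mentioned above.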

So we have a little bit of a setback while we work these issues out.
The end result is that merge tracking should be more correct than it
was, and this will also solve all of these other issues that had been
nagging us. Plus we will get the bonus that existing users'
repositories will gain a degree of immediate merge intelligence.

Finally, the other issue I mentioned about merge tracking was cyclic
(also called reflective) merges. See #2897 for details. Karl and
Kamesh have been working on this and have a plan of attack in place.
It involves a minor schema change to the SQLite index. The current
Subversion repository design will never completely support this merge
scenario, but when Kamesh and Karl are done it should work a lot
better than it does now.

These issues are obviously not all that needs to be done to wrap up
1.5, but they are likely on the critical path for having the code
ready for RC1. I am not sure how much any of this work can be
parallelized if you want to help. Getting a node-origin cache that
works for fsfs and BDB seems high priority so that we will have decent
performance once all of this is done. I am sure general code review
and commenting on the ideas that have been recorded in the issues
would also help.

-- 
Thanks
Mark Phippard
http://markphip.blogspot.com/

Received on Thu Nov 15 20:18:35 2007
