Symmetric Merge -- status
From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Fri, 25 May 2012 17:08:44 +0100 (BST)
Hi, all.
I want to update you all on the "symmetric merge" [1] status and my plans, and invite your thoughts and any assistance you can give.
I'll be presenting this subject at Elego's SvnDay [2] and at WANdisco events in October, but the presentation will be aimed at users and so will concentrate on how the end result is better for the user and won't say much about the details I'm talking about below.
GRAND PLAN
Two main phases.
Phase 1 (now).
- Implement in terms of "sync" and "reintegrate". Accept their limitations; that is:
    - Any "simple" [3] merge will work fine (in either direction).
    - A non-"simple" merge can be performed only in the same direction as the previous merge.
- Make Subversion use "symmetric merge" automatically for any merge request that we currently handle as a "sync" -- that is when:
    - it's a forward merge
    - no revision range is given (at least, no starting revision; an ending revision is acceptable)
    - "--reintegrate" is not specified
- For testing purposes, we also make Subversion use the "symmetric merge" whenever the test suite requests a "reintegrate". I don't see any reason to make it do that for users; indeed I think it would be bad to make this special option start doing things it didn't do before.
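The selection rule in the list above can be written as a small predicate. This is only a sketch: `MergeRequest` and its fields are hypothetical names invented for illustration, not part of Subversion's API or of merge-cmd.c.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MergeRequest:
    """Hypothetical summary of a merge command line (not a Subversion type)."""
    forward: bool              # True unless a reverse merge was requested
    start_rev: Optional[int]   # None when no starting revision was given
    end_rev: Optional[int]     # an ending revision on its own is acceptable
    reintegrate: bool          # True when "--reintegrate" was specified


def uses_symmetric_merge(req: MergeRequest) -> bool:
    """Return True for requests that are currently handled as a "sync"."""
    return req.forward and req.start_rev is None and not req.reintegrate
```

The test-suite-only mapping of "reintegrate" requests onto the symmetric code is deliberately not modelled here, since it is not meant to be visible to users.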
Phase 2 ("later"):
- Rewrite more of the merge code to alleviate limitations -- to be able to skip cherry-picks, support mixed-rev etc. when merging in either direction.
- Make the implementation more symmetric. This involves pretty deep changes in the merge code, so much so that I think this task would best be combined with a significant revision of the internal merge data structures (svn_mergeinfo_t and so on). Maybe even combined with a revision of the way mergeinfo is stored.
Concentrate on getting phase 1 complete and releasable. I *think* it is nearly done. (See TESTING, below.) The implementation mimics "sync" or "reintegrate" depending on where it finds the most recent base, according to the rule that "sync" should be used when merging again in the same direction as last time, and "reintegrate" when merging in the opposite direction. It doesn't matter that this implementation has all the limitations of the current "reintegrate" merge when changing direction, because that's already as good as 1.7 for all 1.7-supported cases AFAIK. The benefit of it just Doing The Right Thing for simple merges, enabling repeated to-and-fro merging, seems huge.
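The direction rule described above can be sketched as follows. The function names and the direction labels are hypothetical, and treating "no previous merge" like a "sync" is my assumption rather than something taken from the implementation.

```python
from typing import Optional


def choose_merge_style(previous_direction: Optional[str],
                       requested_direction: str) -> str:
    """Pick the style to mimic, given hypothetical direction labels
    such as "trunk->branch" and "branch->trunk".
    """
    # Assumption: with no previous merge, behave like a "sync".
    if previous_direction is None or previous_direction == requested_direction:
        return "sync"
    return "reintegrate"


def is_supported_in_phase1(simple: bool,
                           previous_direction: Optional[str],
                           requested_direction: str) -> bool:
    """A "simple" merge works either way; others only in the same direction."""
    style = choose_merge_style(previous_direction, requested_direction)
    return simple or style == "sync"
```

For example, after a "trunk->branch" merge, a further "trunk->branch" request maps to "sync" and a "branch->trunk" request maps to "reintegrate"; a non-"simple" merge is accepted only in the first case.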
Phase 2 is much lower priority and much more a SMOP, with less impact on users (documentation etc.). Phase 2 will bring flexibility that isn't of great importance to users AFAIK, since cherry-picking and subtree merging are most often used alone -- on a divergent branch (that's not going to be reintegrated) -- and rarely on a convergent branch (which is going to be reintegrated, so to-and-fro merging is likely). And phase 1 already enables cherry-picks etc. to be accommodated to some extent.
CONCERNS
The main concerns a couple of months ago were that it wasn't handling subtrees and mixed-rev WCs and so on. I believe now that it does (in the "sync" direction -- that is, whenever merging in the same direction as the previous merge). There are a few tests failing (see below) but from a design and implementation point of view I am confident that it should support these cases and that these failures must be due to relatively minor issues.
TESTING
Current test suite:
The following tests fail when merge-cmd.c is patched to call "symmetric merge" for sync and reintegrate merges [4]:
FAIL: merge_reintegrate_tests.py 10: merge --reintegrate with subtree mergeinfo
FAIL: merge_tests.py 78: dont merge revs into a subtree that predate it
FAIL: merge_tests.py 88: subtree merges dont cause spurious conflicts
FAIL: merge_tests.py 89: target and subtrees need nonintersecting revs
Clearly there's something up with subtree merges, but, as I said above, I have reason to believe that it's not fundamentally broken or unsupported.
New tests:
I've started "merge_symmetric_tests.py".
Are any new tests required to ensure that existing scenarios, which may not be tested yet, aren't broken?
- The "keep-alive dance". For completeness, we should check how that will behave, as we can assume some people will have adopted practices that incorporate it. I do NOT think we should continue to support that work-around: if it continues to behave as now, that's fine, and if its behaviour changes, I expect to be able to claim that as an intentional behaviour change for the better. That is, to be plain, if anyone's relying on that, they may need to change their practice.
- The tests I have at the moment are pretty small cases. It would be good to create a test that exercises a series of much bigger merges.
- Any other scenario?
PERFORMANCE
- Performance (in terms of network traffic, in particular). After a series of (same direction or to-and-fro) merges, is the cost of the base-finding algorithm proportional to time since the YCA of the branches (i.e., ever increasing), or only proportional to time since last merge?
- Any other performance concern?
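To make the first question concrete, the two hypotheses can be compared with a toy cost model. This is neither a measurement nor the real base-finding algorithm: it only counts how many revisions would have to be examined under each hypothesis, and all names and revision numbers are made up.

```python
from typing import List


def revs_scanned_since_yca(yca_rev: int, head_rev: int) -> int:
    """Hypothesis A: the search always walks back to the YCA."""
    return head_rev - yca_rev


def revs_scanned_since_last_merge(merge_revs: List[int],
                                  yca_rev: int,
                                  head_rev: int) -> int:
    """Hypothesis B: the search stops at the most recent merge."""
    earlier = [r for r in merge_revs if r <= head_rev]
    # With no merge so far, fall back to the YCA.
    last = max(earlier, default=yca_rev)
    return head_rev - last
```

With a YCA at r100, merges at r200, r300 and r400, and HEAD at r450, hypothesis A examines 350 revisions and keeps growing with the age of the branch, while hypothesis B examines 50. A test that records network traffic over a long series of merges could tell which of the two the real code resembles.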
Please let me know any thoughts you have. And if you might be able to take on the investigation of one of the test failures, or writing a new test (whether pseudo-code or actual code), or checking the performance, that would be awesome.
- Julian
[1] <http://wiki.apache.org/subversion/SymmetricMerge>
[2] <http://www.elego.de/svnday2012>
[3] Define a "simple" merge as one that does not find any subtree merges, cherry picks, or a mixed-rev/switched/sparse WC.
[4] Attached patch, "use-symmetric-merge-1.patch", makes all (?) sync and reint merge requests use the 'symmetric' code.
This is an archived mail posted to the Subversion Dev mailing list.