Conflict resolver callback design

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Fri, 15 Mar 2013 21:01:06 +0000 (GMT)

A long email, for folks interested in conflict resolution.

I have been evaluating our conflict resolution callback support, partly in the course of studying how to provide better conflict resolution, and partly in the course of issue #4316 -- fixing 'merge' so that it continues to the next revision range if you interactively resolve all conflicts.

In this email I analyze the current state of support for the conflict resolution callback design and suggest ways in which it could be improved. I haven't mentioned everything here -- for example the several known problems with the current "conflict description" data type.

At the end I talk about how this impacts a fix for issue #4316.

Don't panic -- I'm not suggesting we should hold up 1.8 for any of this, and I haven't written a concrete proposal for such changes at this point. With that in mind, comments welcome.

INTRODUCTION

We have a conflict resolver callback function ('CB') defined as:

/** A callback used in merge, update and switch for resolving conflicts
   * during the application of a tree delta to a working copy. [...] */
typedef svn_error_t *(*svn_wc_conflict_resolver_func2_t)(
    svn_wc_conflict_result_t **result,
    const svn_wc_conflict_description2_t *description,
    baton, result_pool, scratch_pool);

The rough idea is that the client's callback is called every time a conflict is raised, and can decide whether to postpone it (record it in the WC and move on), or resolve the conflict according to some pre-determined fixed choice or set of rules, or let the user interactively resolve it.

=== Pre-conflict and post-conflict operating modes.

The problem is basically that we have confused two different operating modes -- with different purposes and requirements -- for such a callback:

(1) Sometimes we call the CB before raising a conflict, in order to decide whether we need to record a conflict or can immediately resolve it.

(2) Other times we record a conflict in the WC and then, later, we call the CB as a way of presenting the stored conflict to the client. In this case we want the user to be offered a range of interactive options including examining the conflict and the WC and making changes to the WC.

If we like the "hook" terminology, we could call mode 1 a "pre-conflict hook" and mode 2 a "post-conflict hook". Both of these modes have their uses.

Mode 1 is viable for a text or prop conflict, since the result can reasonably be assumed to be a single file or property value that is to be installed at the current path. With a tree conflict, however, the result is expected to be at least a tree change at the current path, and possibly (especially when moves are involved) a tree change at another path (or paths). A tree change generally requires using the WC APIs to build the result incrementally, not just returning a single object and asking the WC to install it.

In mode 1, the CB needs to return quickly because we are in the middle of an update, switch or merge, with a network connection open, and we haven't yet recorded the conflict or completed the update (or switch or merge) so the new WC state is not complete (for example the incoming change might be half of a move and we might not have received the other half of the move yet). For both of these reasons, it is not appropriate to offer interactive resolution. It may not be feasible to offer all forms of tree conflict resolution, especially where a move is involved.

Advantages of mode 1 include: it avoids sending 'C' notifications; it may result in less CPU and network load. The avoidance of 'C' notifications may be important for improving signal to noise ratio, in large scale usage. It is not necessarily a significant advantage, since the notification receiver should be able to process them and filter them out if required, but that may add unwanted complexity to the receiver.

In mode 2, the update or switch can be complete before we start resolving, or in a merge of multiple revision ranges, at least one revision range can be completely merged.

Mode 2 doesn't need a callback function to implement it: we could instead provide a "pull" interface for the client to request information about the conflicts and

Advantages of mode 2 include: plenty of time for interactive resolution; all the WC state is available to be examined; client can implement whatever logic it likes to resolve a conflict rather than having to provide one of a limited number of answers; can operate on more than one node and so is suitable for tree conflicts and conflicts involving moves.

When the callback is called, these conditions should be well defined ...

- The conflict description provided.
- Whether the WC write lock is held.
- Whether the conflict is already recorded in the WC.
- The WC state of the victim path.
- The WC state of other related paths (such as parent, children, move-to).
- What the callback is allowed to do to the WC (queries, changes) in order to resolve the conflict.

... but currently are not.

ANALYSIS

=== Where is it called?

CB is called during up/sw/merge/resolve.

Passed through 'ctx' to these libsvn_client (public/inter-library) functions:
svn_client_update
svn_client_switch
svn_client_merge
svn_client_resolve

Passed by parameters to these libsvn_wc(public/inter-library) functions:
svn_wc_merge5
svn_wc_merge_props3
svn_wc__get_update_editor
svn_wc__get_switch_editor
svn_wc__get_file_external_editor
svn_wc__resolve_conflicts

up/sw/merge: Library functions call CB only for text and prop conflicts.
resolve: Library functions call CB for all conflicts.

up/sw/merge: Library functions call CB *before* installing a conflict description in the WC DB (mode 1) -- at least for text & prop conflicts; not sure about local-move tree conflicts.
resolve: Library functions call CB *after* installing a conflict description in the WC DB (mode 2).

'svn update' (and 'svn switch') currently uses mode 2. When it calls the libsvn_client update function, it passes in a (mode 1) CB that simply records the paths and postpones the conflict resolution. After the update is complete, it then calls 'resolve' on each recorded conflicted path (mode 2).

'svn merge' currently uses a mixture of mode 1 and mode 2. When interactive resolution is requested, it works the same way 'update' does; otherwise it uses mode 1 for text and prop conflicts and (AFAICS) mode 2 for tree conflicts.

=== Is the conflict description consistent?

up/sw/merge: CB is passed a conflict description.
resolve: CB is passed a *different* description for the same conflict, in some cases (notably, property conflicts).

I modified 'svn' so that it ran a merge using a mode 1 callback (for text and prop conflicts) that records each conflict description, and then ran 'resolve' with a callback that compares each conflict description with the corresponding one that was recorded earlier. There are big differences in the property conflict descriptions. This is a symptom of using different code paths to generate the descriptions.

=== The WC write lock

up/sw/merge: CB is called while there is a WC write lock on the 'victim' path.
resolve: CB is called while there is a WC write lock covering 'all possible paths affected by resolving' the conflicts in the given tree.

=== What may the CB do with the WC?

It's not clear what the CB is allowed to do to the WC.

=== The 'adds_as_modification' flag

Some existing APIs have an 'adds_as_modification' flag, which says don't raise a tree conflict for add-onto-add (if the node kind is the same), but instead resolve the potential conflict by collapsing the two adds and treating the local add as the desired new value, so the result looks like a modification -- or non-modification -- of the incoming added node.

This flag is an example of a "mode 1" pre-determination for certain kinds of conflict. This kind of functionality should be provided by a mode-1 conflict resolution callback.

It is present in: svn_client_update4(), svn_client__update_internal(), svn_wc__get_update_editor().

MERGE WITH INTERACTIVE RESOLUTION

=== Issue #4316 'Merge errors out after resolving conflicts'

The interactive resolution for 'svn merge' is broken: it doesn't go on to merge the remaining revision ranges even if you resolve all the conflicts -- issue #4316.

In order to fix that, I have a patch which makes svn_client_merge*() call the CB for all conflicts raised, after merging each revision range, and then go on to the next revision range. Merge will no longer ever make "mode 1" calls to the CB, always mode 2.

For consistency I would consider doing the same kind of thing to 'update' and 'switch' APIs -- that is, make them call the CB at the end of the operation, for each conflict that was raised, instead the 'svn' client running the update with a 'postpone' CB installed and then running 'resolve' afterwards. I haven't yet planned to do this but it would seem to be the sensible thing.

=== Notifications

This changes the notifications that are produced by a merge when using a fixed conflict resolution option such as '--accept=mine-full'. Now, even these conflicts will result in a 'C' notification being sent to the notification receiver, and later a 'Resolved conflicts on xxx' notification if the CB resolves the conflict. I will attempt to patch the notification receiver in 'svn' to notice if some or all of the reported conflicts were resolved, and at least make the summary of conflicts say something intelligent about that.

NEXT STEPS

At some point after fixing #4316, and subject to feedback, I intend to follow up with a more concrete proposal for how we can revise our API support for the conflict resolution callback, to recognize the difference between mode 1 and mode 2. We should provide two very different callbacks or mechanisms, something very minimal or nothing at all for mode 1, and something rather more flexible than we have for mode 2.

I'm not suggesting any of this should happen before 1.8, but now is the time to say if you think it should.

Thoughts?

- Julian

--
Certified & Supported Apache Subversion Downloads: http://www.wandisco.com/subversion/download

Received on 2013-03-15 22:01:40 CET

This message: [ Message body ]
Next message: Paul Burba: "Re: [RFC] Issue Tracker Cleanup"
Previous message: Stefan KÃ¼ng: "Re: segfault when reverse merging in trunk"
Next in thread: Bert Huijben: "RE: Conflict resolver callback design"
Reply: Bert Huijben: "RE: Conflict resolver callback design"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]