Re: branch tree-conflicts-notify: merge to trunk?

From: Neels J. Hofmeyr <neels_at_elego.de>
Date: Tue, 28 Oct 2008 05:36:20 +0100

Stephen Butler wrote:
> Quoting Julian Foad <julianfoad_at_btopenworld.com>:
[...]
>> I got these failures when I tried the branch a couple of hours ago
>> (r33896):
>> FAIL: externals_tests.py 14: old 'PATH URL' format should ignore peg
>> revisions
>> FAIL: externals_tests.py 16: place a file external into a directory
>> external
>
> I'll throw in a couple of svn resolved command to get rid of the
> inadvertently-created tree conflicts (of the add:add variety).
>
>> FAIL: merge_tests.py 103: merge tries to delete a file of different
>> content
>
> For me test 103 fails intermittently. When it does fail, the expected
> stdout and actual stdout appear to match anyway. Rather strange.
>
> EXPECTED STDOUT:
> Skipped 'A/D/G2/tau'
> --- Merging r2 into 'A/D/G2':
> C A/D/G2
> Summary of conflicts:
> Tree conflicts: 1
> Skipped paths: 1
> ACTUAL STDOUT:
> Skipped 'A/D/G2/tau'
> --- Merging r2 into 'A/D/G2':
> C A/D/G2
> Summary of conflicts:
> Text conflicts: 1
> Skipped paths: 1
> EXCEPTION: SVNLineUnequal
> Traceback [...]

The error here is that in the ACTUAL STDOUT, it says "Text conflicts", not
"Tree conflicts". (That has caught me once before: When quickly comparing
outputs, "Text" and "Tree" is easily mistaken to be the same.) I haven't
changed the test because it calls for deeper fixes.

This is the tiny tip of an iceberg that will require the diff callbacks to
change (but a different change than what diff --summarize needs). The merge
code needs to be tree-conflict aware. The relevant action takes place in the
diff_callbacks, and it is kind of difficult to get the information out of
there: diff callbacks return each conflict state (text, prop) separately via
a pointer argument passed to the callback function. So we need another
pointer argument to return a tree-conflicted state to the diff callbacks'
notification layer (I'm saying that the callbacks signatures need to have
another argument added). BTW, I really don't like the design of the diff
callbacks in that respect. It takes lots of zombie typing to refactor.

Since the returned conflict arguments are of type svn_wc_notify_state_t, I
once thought of adding a svn_wc_notify_tree_conflicted field to the enum,
but that's no good: We have two separate values, content_state and
prop_state, and either one could then be tree conflicted. We'd have to check
both all over, and get confused because the data structure doesn't reflect
the actual orthogonality.

BTW, the branch tree-conflicts-notify can take no more of this. It should go
back to trunk -- *now*.

>
>> FAIL: stat_tests.py 31: status with tree conflicts
>
> Expected status
>
>> FAIL: tree_conflict_tests.py 9: merge file: modify onto not-file
>> FAIL: tree_conflict_tests.py 11: merge file: del/rpl/mv onto not-file
>> FAIL: tree_conflict_tests.py 13: merge dir: modify onto not-dir
>> FAIL: tree_conflict_tests.py 15: merge dir: del/rpl/mv onto not-dir
>
> In all 4 cases, commit of a dir containing a text-conflicted file
> produced no output, not even an error message. Not sure how we broke
> this.

I think I found why. We've changed the code to return the tree-conflicted
state on the victim, but these tests erroneously rely on the previous
behaviour where the conflict is reported on the victim's parent and blocks
that parent.

E.g. tree_conflict_tests.py 9 skips a missing target in a directory (the
directory branch2/ft has no file F in it). That is a tree-conflict and gets
noted:

$ svn status
! C branch2/ft/F

However, merge does not have per-victim notification yet (see way above),
and when above conflict was raised, it showed

$ svn merge <stuff>
C branch2/ft

as in the olden days: a "text conflict" on the parent. Furthermore, the test
expects branch2/ft to be in conflict, but it now isn't anymore. We removed
*HAS_TREE_CONFLICTED_CHILDREN from svn_wc_conflicted_p2(), so the commit
doesn't get a tree-conflict on branch2/ft anymore. The real conflicted item
is branch2/ft/F, which is missing and doesn't get tested by commit in the
first place.

Now, do we have a design problem here? Maybe it *is* necessary to block a
recursive commit on a directory that has *missing* items that remain in
tree-conflict, and thus we *do* need to block a commit in a directory that
*contains* certain tree conflicts? Repeat:
- A commit should only be blocked when the whole directory is committed
recursively. This already happens for nodes that are actually present, but
missing ones currently don't get checked.
- When committing non-conflicted siblings of a tree-conflict victim
explicitly, then that shouldn't block.
So maybe we *do* still need the HAS_TREE_CONFLICTED_CHILDREN as an optional
return value in svn_wc_conflicted_p2().

This is slowly turning into quite a landscape of numerous different levels
of tree-conflicts:

tree-conflicts of relevance on a given node:
- just been newly raised on this node.
- just been newly raised on an ancestor while crawling.
- persisting on this node (from a previous operation).
- persisting on any ancestors.
- persisting on a *missing* child node, when recursing a dir.

Hey, what if I have
svn up
C A
but I do
cd A/sub/dir
svn ci
-- will the commit fail to block because we're not crawling up into the
ancestors of the current working copy directory? Do we have to persist the
complete subtree of a tree-conflicted WC dir as conflicted, even though not
notifying or info-ing about it? There could be a flag saying "yeah, you
can't commit me, but it's one of my ancestors' fault, so you have to fix
stuff further up". (Or is this already accounted for?)

Wow, tree-conflicts are really complicated. There's, like, whole *layers* of
stuff. ;)

>
> Aside from sorting out these tests, we also need to make some new
> tests for detection of existing conflicts, following the general
> scheme that you outline below. Oops, guess I just volunteered. :-o

I thought of kind of changing the DeepTreesTestCase running code to just
repeat whatever it did to raise its tree-conflicts and try to evaluate
(maybe we need to pass in more expected_stuff or maybe there's a simple
generic way of checking whether a persistent tree-conflict is handled
correctly). But I'd be happy to leave this to you or anyone else, as long as
it's not me. ;)

>> I hope it's OK if I do a catch-up merge on your branch. Last Friday (I

Yes, go ahead. :)

>> think) I tweaked the "check_tree_conflict()" function in update_editor.c
>> to return the conflict description it raised, and I committed that to
>> trunk, while you had done it a slightly different way (returning a
>> boolean flag instead) on the branch. Returning the whole conflict should
>> allow us more flexibility to choose how much detail we put into the
>> notification, like I do in the attached patch by sending it to the new
>> function "notify_new_tree_conflict()".

+1, but

> Yes, I suppose returning the whole conflict is the long-term solution.
> I'd like to merge the branch to trunk ASAP, though. Can we do the
> boolean->conflict type change in a separate commit to trunk?

>> Also I think check_tree_conflict() may benefit (sooner or later) from
>> being split into separate "detect" and "raise" functions, with the
>> conflict description being returned by the former and passed to the
>> latter.

+1, as soon as it is needed.

>
> Yes, the tree conflict notification should be encapsulated in a simple
> function.

Now, `raise' and `notify' are entirely different things. `raise' makes the
entry reflect the tree-conflict. `notify' is very specific to the respective
surrounding code in the way that the notification is propagated. So, `raise'
should be a simple function. But `notify' unfortunately can't: e.g. in the
update_editor.c, notify_func() calls are right in between everything else.
That would qualify for a simple tree-conflicts notification function. But
merge, for example, uses the diff callbacks that have the iceberg mentioned
above, so we don't have access to the notify_func and can't call it
ourselves. So I guess it's a show-stopper for combining `raise' and `notify'
in the same function.

There's some wrapping/evaluating done around the notification states
returned by the diff callbacks, and I don't think we can easily change the
diff callbacks to just provide the darn notification function like it was
intended.

>
> Currently the tree conflict notifications are mixed in with other
> notifications, leading to output such as 'D C'. But very soon we'll
> eliminate that mixing, by skipping updates to the victims.

Hey, haven't I already done that? I think I have. (on the branch)

> Then we
> can notify the user of a tree conflict immediately instead of waiting
> for dir_close or file_close (in some cases).

Depends on the caller, s.a..

>
>>
>> Yell if that seems wrong.
>>
>>
>> Now, stages of detection...
>>
>> We could be clearer on exactly what Subversion should be doing during
>> "svn update/switch", with respect to conflict detection and skipping of
>> obstructions.
>>
>> One of you mentioned before that a note about this would make a good
>> code comment in update_editor.c, and I agree.

I was actually meaning that exact structure I saw there. I find below
enumeration quite difficult to filter... Let me see, originally, it was this:

   if target was already in conflict:
     skip this action
   else
   if this action will cause a tree conflict:
     record the tree conflict
     skip this action
   else:
     do this action

and from there, I see that your text aims to more precisely define this
pseudocode, and it adds an "else if obstructed":

>>
>> Does this look right?
>>
>> [[[
>> /*
>> * An "update" or "switch" should include, in this order:
>> *
>> * 1. if # Incoming change affects an item already in conflict:
> a.)
>> * tree/text/prop change to node beneath tree-conflicted dir,
>> or
> b.)
>> * tree/text/prop change to tree-conflicted dir/file, or
> c.)
>> * tree change to a text/prop-conflicted file/dir, or
> d.)
>> * text/prop change to a text/prop-conflicted file/dir[1],
>> * then # Skip this change [2]:
>> * do not update the Base nor the Working
>> * notify "skipped because already in conflict"
>
> I hadn't thought about sub-items 1c or 1d above. Good catch.
>
> So IIUC we'll print one line of output upon detecting an existing conflict
> (1b, 1c, 1d above), or creating fresh tree conflict (#3 below).
>
> I think we should be completely silent for the rest of a conflicted
> tree (1a), to avoid burying the user in lots of "Skipped" messages.
> Too much output would distract the user from the initial message,
> which describes where the tree conflict is. After resolving the tree
> conflict, the user can check status or resume update/switch, and find
> any other conflicts in the tree.

So here we're talking about the tree-conflicts that persist from a previous
operation? Then -1, I'd still like to see (1a) notifications, or at least a
skip notification for the topmost items of (1a) (I mean that it says
"skipped this whole tree because it is inside a tree-conflicted dir", but
would have to report scattered files separately, probably). Otherwise some
possible scenarios would just return without ANY notification at all, not
indicating that it is skipping something due to a persisting tree-conflict.

$ svn <operation1>
C my/dir
$ svn <operation2> (wants to add file `my/dir/subtree/addition')
Skipped my/dir/subtree/addition, because my/dir remains in conflict.

Hm, or maybe it should say something like
$ svn <operation2>
Skipped my/dir, because it remains in conflict.
even though my/dir was *no* explicit target. This could become very
confusing when stepping into my/dir/subtree, calling <operation2> and
reading something about "Skipped my/dir" although it is not below, but above
this directory. Run for cover! Worms!!

>
>
>> *
>> * 2. if # Incoming change affects an item that's "obstructed":
>> * on-disk node kind doesn't match recorded Working node kind,
>> * including an absence/presence mis-match
>> * then # Skip this change [2]:
>> * do not update the Base nor the Working
>> * notify "skipped because obstructed"
>> *
>> * 3. if # Incoming change raises a tree conflict:
>> * tree/text/prop change to node beneath tree-scheduled[3] dir, or
>> * tree/text/prop change to tree-scheduled[3] dir/file, or
>> * text/prop change to tree-scheduled dir/file,
>> * then # Skip this change:
>> * do not update the Base nor the Working [4]
>> * notify "tree conflict"

Here, in contrast, a *newly raised* tree-conflict in a directory should
cause all "skip" messages of sub-nodes to be omitted.

>> *
>> * 4. Apply the change:
>> * update the Base
>> * update the Working, possibly raising text/prop conflicts
>> * notify as in v1.5

So this is the final "else", the "default:" case, the "what we do if nothing
above went wrong", right?

Then I agree with this, though I haven't checked it to the letter, nor
thought deeply about cases that might be missing.

>> *
>> *
>> * [1] Details of which combinations of property and text changes
>> conflict
>> * are out of scope here and defined by v1.5 behaviour.
>> *
>> * [2] Should skip changes to an entire node, as the base revision number
>> * applies to the whole node. Not sure how this affects attempts to
>> * handle text and prop changes separately.
>> *
>> * [3] Not all combinations implied can happen, but that doesn't matter.
>> *
>> * [4] For now, we skip the update, and require the user to:
>> * - Modify the WC to be compatible with the incoming change;
>> * - Mark the conflict as resolved;
>> * - Repeat the update.
>> * Ideally, it would be possible to resolve any conflict without
>> * repeating the update. To achieve this, we could store the possible
>> * new base(s) at conflict detection time, and update the actual
>> base at
>> * the time of resolving.

I don't really understand this. It sounds quite simple, but I can't
visualize much detail.

>
> All in all this looks good to me.

Yeah.

>
>>
>>
>> I noticed considerable inconsistencies in this, in the current "update"
>> code, while trying to make "update" skip changes inside a
>> tree-conflicted directory tree. I haven't worked through trying to
>> implement the above, I'm just telling you that's somthing I'm looking
>> at, as I think we need to get this behaviour clearly specified.
>
> Yes the current update code works only for fresh tree conflicts. It's
> definitely incomplete.

Yes. I've only done the "fresh conflicts" part so far.

>
> Here's my basic implementation plan:
>
> The following update callbacks need to check for existing conflicts:
>
> open_directory
> open_file
> add_directory
> add_file
> delete_entry
> absent_file
> absent_directory
>
> In each callback, we'll make the following checks/calls.
>
> (eb->inside_a_tree_conflict)
>
> True if we're walking a tree-conflicted tree. Works even if the
> target and its parent dir do not exist (e.g., we've skipped adding
> the parent dir due to the tree conflict). Skip target silently if
> true.

Nice. We could actually add an eb->inside_a_tree_conflict_was_notified or
something, which the topmost conflict cause sets to TRUE if it notified
about the conflict. Then, when having a subtree as target, this is FALSE and
indicates that we should still notify about a skip.

>
> svn_wc_conflicted_p2()
>
> Check for existing conflicts. Requires a versioned parent dir for
> a tree conflict. Requires a versioned parent dir and versioned
> target for text or prop conflicts. Skip target with notification
> if conflicts are found.
>
> check_tree_conflict(&tree_conflict, eb, ...)
>
> Returns a fresh tree conflict if the target is a victim. Requires
> a versioned parent dir. If a conflict is found, record it
> persistently.

So, this may eventually split into detect_- and raise_tree_conflict().
Although I'd rather call that persist_tree_conflict() instead of raise_,
raise sounds too much like "notify and bail out", both of which it doesn't.

>
> notify_tree_conflict(tree_conflict, eb, ...)
>
> If there's a tree conflict and an eb->notify_func, notify the user.

That will probably not work, at least for merge code, see above. Although,
then again, we might call notify_tree_conflict() in the diff_callbacks
driving structure. Then again then-again, `svn diff' doesn't *need* that at
all. And it's odd to add something to the "diff_callbacks" that is outside
the scope of a diff altogether. Or are we going to hint at tree-conflicts in
a diff as well? I don't think so. Maybe these callbacks have the wrong name.

Thanks guys for the good collaboration, I'm benefiting greatly from your posts.

I was planning to merge to trunk now but reviewing these issues took too
long. Steve, if you like you can merge tree-conflicts-notify to trunk,
otherwise I'll do that tomorrow (or as soon as possible).

This must be the longest consecutive time that I've written on a single
email, ever. "It's so long because I didn't have the time to make it shorter".

~Neels

application/pgp-signature attachment: OpenPGP digital signature

Received on 2008-10-28 05:37:00 CET

This message: [ Message body ]
Next message: David Glasser: "Re: inotify, etc"
Previous message: Justin Erenkrantz: "Re: inotify, etc"
In reply to: Stephen Butler: "Re: branch tree-conflicts-notify: merge to trunk?"
Next in thread: Julian Foad: "Re: branch tree-conflicts-notify: merge to trunk?"
Reply: Julian Foad: "Re: branch tree-conflicts-notify: merge to trunk?"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]