Re: allowing multiple conflicts in storage

From: Julian Foad <julian.foad_at_wandisco.com>
Date: Wed, 31 Mar 2010 19:34:41 +0100

On Wed, 2010-03-31, Stefan Sperling wrote:
> On Wed, Mar 31, 2010 at 01:44:23PM +0100, Julian Foad wrote:
> > The Source of the Incoming Change
> > =================================
> >
> > I suggest something like (? means an optional field):
> >
> > * A diff between nodes that have a location in a repository.
> >
> > left=(repo, rev, relpath, sha1?)
> > right=(repo, rev, relpath, sha1?)
> >
> > # This source description is normally from merge/up/sw.
> > # Full texts are (at least in principle) available.
> >
> > * A Patch File
> >
> > patch-file=(path, (info to identify which part of the
> > patch file applies to this node - e.g. byte
> > offsets or other info)?)
> > reported-left=(repo?, rev?, relpath?)
> > reported-right=(repo?, rev?, relpath?)
> >
> > # This source description is normally from operation 'patch'.
> > # The reported-{left,right} are taken from info in the patch
> > # file if available: e.g. from "+++ foo (r1000)" we can fill
> > # in at least 'rev' and maybe 'relpath'. These cannot be
> > # considered trustworthy even when given, as patch files can
> > # be constructed and edited at will.
> >
> > * A diff between nodes that don't have a repository.
> > # e.g. a merge from a diff between two locally-modified WCs.
> > # We have no need of this yet (not supported) but we should
> > # expect to support this kind of source of change some time.
>
> This sounds very interesting.
>
> Can you try to adjust notes/wc-ng/conflict-storage accordingly?
> The notation you used is different from what is used there,
> so I'm not sure I really understand whether the above is meant
> to supplement the current design or not.
> I'm making some suggestions below but maybe you have better ideas.

OK I've attached a patch with ideas and suggestions. (The above was
meant to be a concept that should be embodied in the design, not a
design in itself.)

> > A Change Affects the Whole Node
> > ===============================
> >
> > It is important to record the whole-node source of the change, as far as
> > possible, even when only one property conflicts, because I want to be
> > able to see what the other properties and text were on {left, right,
> > mine} to help me resolve the conflict. So we must not think of "a
> > property conflict" as being a fundamental thing: the fundamental thing
> > is "a change to this node conflicts with another change to this node".
> > The assumption that all properties are independent from each other and
> > independent from the text is a high-level approximation which the client
> > makes, to be more convenient in simple cases, but not a fundamental
> > truth, and we must not tie the conflict storage design to only
> > supporting this assumption.
>
> I think the whole-node information you want is stored by the OPERATION
> skel I added today (we can still augment the OPERATION skel if you want
> more information in it).

Yup, that looks more like it.

> The OPERATION skel contains information as to what caused the conflict
> (the source), and the TYPE skel records details about the nature of the
> conflict. See notes/wc-ng/conflict-storage.
>
> > In the same way, when we record a tree conflict, we want to record the
> > {left, right, mine} versions of the whole node. We don't have the
> > infrastructure to do this yet - we don't have a pristine store for sets
> > of properties, let alone for directory trees - but we could do in
> > principle and I hope we will do in practice one day.
>
> Recording left,right,mine trees would be nice, yes.
>
> > The Part(s) of the Node Affected By The Change
> > ==============================================
> >
> > Briefly: the WC probably needs to know only "this node is in conflict"
> > or perhaps tree-conflict versus content-conflict, and no more detail
> > than that. By "know" I mean what the WC layer itself needs to
> > understand and use.
>
> In the current model answering "is this node in conflict" means comparing
> the conflict_data column to NULL. To find out what kinds of conflicts
> there are, libsvn_wc can iterate the list of TYPE skels and look at the
> very first atom in each skel.

Great.

> > It needs to be able to *supply* much more detail,
> > of course, to the client layer, but that can (and should) be in a blob
> > whose content is opaque to libsvn_wc. Layering is Good (unless the
> > layers are too many, too thin; but the WC layer can hardly be accused of
> > that). So I think it IS important to have a common part of the info that
> > libsvn_wc understands itself, and a detailed part that libsvn_wc simply
> > reports to libsvn_client and which libsvn_client composes and
> > decomposes.
>
> Then maybe we should split the TYPE-specific and OPERATION skels
> into a separate list, such as:
>
> ((KIND (OPERATION KIND-SPECIFIC)) ((KIND (OPERATION KIND-SPECIFIC) ...)
>
> The WC layer would only ever look at KIND, and not parse the skels any
> further than that.
> The client layer would supply and interpret data in (OPERATION KIND-SPECIFIC),
> setting the "blob" via libsvn_wc functions.

(As I said in my follow-up, I was confused between libsvn_client with
upper layer of libsvn_wc. In my dream world, the logic of "update" is
handled in the client layer as a sibling of "merge".)

> Would this sufficiently fulfill the layering requirement?
> Probably, but there's a slight problem with this scheme.
>
> E.g. I'd like the client layer to be able to ask about a conflict
> on a specific property. Or about the reject conflict caused by a
> specific hunk. This means that the wc layer will need to record some
> sort of identifier (e.g. a property's name) to tell several recorded
> property conflicts apart.

It's all just a question of where you draw the line above/below that API
and all the other ones you need. I don't see how this particular
calling out of property names is special.

> So, we'd have something like this instead:
>
> ((KIND [ID] (OPERATION KIND-SPECIFIC))
> (KIND [ID] (OPERATION KIND-SPECIFIC)) ...)
>
> For types which can only be recorded once, the ID can be ommitted.
>
> In such as scheme the client layer would use a low-level function
> interface like:
>
> svn_wc__add_text_conflict(char *conflict_data)
> svn_wc__add_prop_conflict(char *property_name, char *conflict_data)
> svn_wc__add_tree_conflict(char *conflict_data)
> svn_wc__add_reject_conflict(char *hunk_header, char* conflict_data)
>
> svn_wc__list_prop_conflicts(apr_array_header_t **property_names)
> svn_wc__list_reject_conflicts(apr_array_header_t **hunk_headers)
>
> svn_wc__get_text_conflict(char *conflict_data)
> svn_wc__get_prop_conflict(char *property_name, char *conflict_data)
> svn_wc__get_tree_conflict(char *conflict_data)
> svn_wc__get_reject_conflict(char *hunk_name, char *conflict_data)
>
> svn_wc__del_text_conflict(char *conflict_data)
> svn_wc__del_prop_conflict(char *property_name, char *conflict_data)
> svn_wc__del_tree_conflict(char *conflict_data)
> svn_wc__del_reject_conflict(char *hunk_header, char* conflict_data)
>
> (parameter types above are just for illustration, I know much of them
> could be const, and we'll also need pools etc.)
>
> Higher-level conflict handling functions would then live entirely
> within libsvn_client.

No comment on API ideas.

> Note though that the current libsvn_client/libsvn_wc layering is
> not really clean (e.g. update editor in libsvn_wc, merge editor
> in libsvn_client), which may impose restrictions upon how dumb
> libsvn_wc can really be about conflicts.

Yup.

> > A reject is one way to record that some text change could not be applied
> > because it did not match the text found in the target file. That is
> > logically a text conflict, but recorded in a different way, and with
> > less information about which lines of text go where. It is certainly a
> > conflict of text rather than of properties or of tree structure.
> >
> > So, we could have said:
> >
> > There are five types of conflict *description* (text conflict with
> > full texts, text conflict from a patch, property conflict, tree
> > conflict, and obstruction).
> >
> > But property and tree conflicts can (later) come from a patch, so:
> >
> > There are seven types of conflict *description* (text conflict with
> > full texts, text conflict from a patch, property conflict with full
> > sources, property conflict from a patch, tree conflict with full
> > sources, tree conflict from a patch, and obstruction).
> >
> > But in my opinion that not good. As I said above, it's better to
> > distinguish the source of change from the kind of change as two separate
> > dimensions.
>
> I believe we already distinguish the source of the change from the kind.
> I regard the reject as a separate kind of conflict at the storage layer
> because the information we store about rejects is different than what we
> store about text conflicts.

I think we're just arguing terminology and categorization. We both now
understand that both the "reject" and "text" kinds of conflict
description describe a conflict in the text of the file.

- Julian

text/x-patch attachment: conflict-storage-jaf1.patch

Received on 2010-03-31 20:35:19 CEST

This message: [ Message body ]
Next message: Hyrum K. Wright: "1.6.10 up for testing/signing"
Previous message: C. Michael Pilato: "Re: RFC: Release process amendment (was Re: Vetoing latest issue #3020 fix in 1.6.10)"
In reply to: Stefan Sperling: "Re: allowing multiple conflicts in storage"
Next in thread: Stefan Sperling: "Re: allowing multiple conflicts in storage"
Reply: Stefan Sperling: "Re: allowing multiple conflicts in storage"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ] [ by messages with attachments ]