[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: allowing multiple conflicts in storage

From: Stefan Sperling <stsp_at_elego.de>
Date: Wed, 31 Mar 2010 15:38:38 +0200

On Wed, Mar 31, 2010 at 01:44:23PM +0100, Julian Foad wrote:
> The Source of the Incoming Change
> =================================
> I suggest something like (? means an optional field):
> * A diff between nodes that have a location in a repository.
> left=(repo, rev, relpath, sha1?)
> right=(repo, rev, relpath, sha1?)
> # This source description is normally from merge/up/sw.
> # Full texts are (at least in principle) available.
> * A Patch File
> patch-file=(path, (info to identify which part of the
> patch file applies to this node - e.g. byte
> offsets or other info)?)
> reported-left=(repo?, rev?, relpath?)
> reported-right=(repo?, rev?, relpath?)
> # This source description is normally from operation 'patch'.
> # The reported-{left,right} are taken from info in the patch
> # file if available: e.g. from "+++ foo (r1000)" we can fill
> # in at least 'rev' and maybe 'relpath'. These cannot be
> # considered trustworthy even when given, as patch files can
> # be constructed and edited at will.
> * A diff between nodes that don't have a repository.
> # e.g. a merge from a diff between two locally-modified WCs.
> # We have no need of this yet (not supported) but we should
> # expect to support this kind of source of change some time.

This sounds very interesting.

Can you try to adjust notes/wc-ng/conflict-storage accordingly?
The notation you used is different from what is used there,
so I'm not sure I really understand whether the above is meant
to supplement the current design or not.
I'm making some suggestions below but maybe you have better ideas.

> A Change Affects the Whole Node
> ===============================
> It is important to record the whole-node source of the change, as far as
> possible, even when only one property conflicts, because I want to be
> able to see what the other properties and text were on {left, right,
> mine} to help me resolve the conflict. So we must not think of "a
> property conflict" as being a fundamental thing: the fundamental thing
> is "a change to this node conflicts with another change to this node".
> The assumption that all properties are independent from each other and
> independent from the text is a high-level approximation which the client
> makes, to be more convenient in simple cases, but not a fundamental
> truth, and we must not tie the conflict storage design to only
> supporting this assumption.

I think the whole-node information you want is stored by the OPERATION
skel I added today (we can still augment the OPERATION skel if you want
more information in it).

The OPERATION skel contains information as to what caused the conflict
(the source), and the TYPE skel records details about the nature of the
conflict. See notes/wc-ng/conflict-storage.

> In the same way, when we record a tree conflict, we want to record the
> {left, right, mine} versions of the whole node. We don't have the
> infrastructure to do this yet - we don't have a pristine store for sets
> of properties, let alone for directory trees - but we could do in
> principle and I hope we will do in practice one day.

Recording left,right,mine trees would be nice, yes.

> The Part(s) of the Node Affected By The Change
> ==============================================
> Briefly: the WC probably needs to know only "this node is in conflict"
> or perhaps tree-conflict versus content-conflict, and no more detail
> than that. By "know" I mean what the WC layer itself needs to
> understand and use.

In the current model answering "is this node in conflict" means comparing
the conflict_data column to NULL. To find out what kinds of conflicts
there are, libsvn_wc can iterate the list of TYPE skels and look at the
very first atom in each skel.

> It needs to be able to *supply* much more detail,
> of course, to the client layer, but that can (and should) be in a blob
> whose content is opaque to libsvn_wc. Layering is Good (unless the
> layers are too many, too thin; but the WC layer can hardly be accused of
> that). So I think it IS important to have a common part of the info that
> libsvn_wc understands itself, and a detailed part that libsvn_wc simply
> reports to libsvn_client and which libsvn_client composes and
> decomposes.

Then maybe we should split the TYPE-specific and OPERATION skels
into a separate list, such as:


The WC layer would only ever look at KIND, and not parse the skels any
further than that.
The client layer would supply and interpret data in (OPERATION KIND-SPECIFIC),
setting the "blob" via libsvn_wc functions.

Would this sufficiently fulfill the layering requirement?
Probably, but there's a slight problem with this scheme.

E.g. I'd like the client layer to be able to ask about a conflict
on a specific property. Or about the reject conflict caused by a
specific hunk. This means that the wc layer will need to record some
sort of identifier (e.g. a property's name) to tell several recorded
property conflicts apart.

So, we'd have something like this instead:


For types which can only be recorded once, the ID can be ommitted.

In such as scheme the client layer would use a low-level function
interface like:

svn_wc__add_text_conflict(char *conflict_data)
svn_wc__add_prop_conflict(char *property_name, char *conflict_data)
svn_wc__add_tree_conflict(char *conflict_data)
svn_wc__add_reject_conflict(char *hunk_header, char* conflict_data)

svn_wc__list_prop_conflicts(apr_array_header_t **property_names)
svn_wc__list_reject_conflicts(apr_array_header_t **hunk_headers)

svn_wc__get_text_conflict(char *conflict_data)
svn_wc__get_prop_conflict(char *property_name, char *conflict_data)
svn_wc__get_tree_conflict(char *conflict_data)
svn_wc__get_reject_conflict(char *hunk_name, char *conflict_data)

svn_wc__del_text_conflict(char *conflict_data)
svn_wc__del_prop_conflict(char *property_name, char *conflict_data)
svn_wc__del_tree_conflict(char *conflict_data)
svn_wc__del_reject_conflict(char *hunk_header, char* conflict_data)

(parameter types above are just for illustration, I know much of them
could be const, and we'll also need pools etc.)

Higher-level conflict handling functions would then live entirely
within libsvn_client.

Note though that the current libsvn_client/libsvn_wc layering is
not really clean (e.g. update editor in libsvn_wc, merge editor
in libsvn_client), which may impose restrictions upon how dumb
libsvn_wc can really be about conflicts.

> A reject is one way to record that some text change could not be applied
> because it did not match the text found in the target file. That is
> logically a text conflict, but recorded in a different way, and with
> less information about which lines of text go where. It is certainly a
> conflict of text rather than of properties or of tree structure.
> So, we could have said:
> There are five types of conflict *description* (text conflict with
> full texts, text conflict from a patch, property conflict, tree
> conflict, and obstruction).
> But property and tree conflicts can (later) come from a patch, so:
> There are seven types of conflict *description* (text conflict with
> full texts, text conflict from a patch, property conflict with full
> sources, property conflict from a patch, tree conflict with full
> sources, tree conflict from a patch, and obstruction).
> But in my opinion that not good. As I said above, it's better to
> distinguish the source of change from the kind of change as two separate
> dimensions.

I believe we already distinguish the source of the change from the kind.
I regard the reject as a separate kind of conflict at the storage layer
because the information we store about rejects is different than what we
store about text conflicts.

Received on 2010-03-31 15:39:16 CEST

This is an archived mail posted to the Subversion Dev mailing list.