Re: Merge mode (Was Classifying files as binary or text)

From: B. Smith-Mannschott <bsmith.occs_at_gmail.com>
Date: Wed, 18 Nov 2009 12:31:28 +0100

On Wed, Nov 18, 2009 at 11:55, Julian Foad <julianfoad_at_btopenworld.com> wrote:
> Julian Foad wrote:
>> I need to spend some time replying, late at night though it is.
>> Let me try to explain why I think a "how to merge" property should not
>> be the primary indicator of how subversion should merge each file.
> Some comments on my own posting...
>> Principle
>> =========
>> I have read that, in the realm of data handling, there is a principle
>> that it is a bad idea to tag data with annotations that say what kind of
>> actions can or should be performed on it. That kind of coupling is
>> unscalable. Instead, it is better to tag data with an indication of what
>> meaning and/or what syntax the data has, and then let tools decide what
>> to do, based on that information.
>> We already have one data-type indicator: svn:mime-type. Now, MIME type
>> is far from a complete data type specifier. It is insufficient for our
>> needs, in theory. However, in practice, it is nearly sufficient. (See
>> Problem 2 below for an exception.)
>> We also have another data-type indicator: the file name. A file name is
>> also an incomplete source of metadata, and some file names ("README" or
>> "CHANGES") give no indication at all of the format, but it is useful in
>> many cases ("*.py", "*.c").
> We also potentially have a third indicator that can be useful: a scan of
> the content to determine whether it is mostly plain text (implemented by
> the mis-named svn_io_detect_mimetype()).
>> Problem 1 (limited recognition of MIME types)
>> =========
>> Subversion mis-categorizes a lot of MIME types as "binary" (and
>> therefore will not merge or diff or blame them) which really are
>> line-based text formats.
>> The list of such MIME types is continually evolving so it is not
>> possible for Subversion to have a built-in complete list. However, it is
>> easy for new releases of Subversion to have an updated list.
> I wonder if I have missed some good reason not to update the built-in
> list of MIME types that are considered "text". Just a constraint on the
> available volunteer effort? I know it's not the 100% solution but it is
> a 99% solution and I am surprised we are not automatically starting
> here.

(1) It's an ongoing maintenance headache.
(2) The merge behavior of existing repositories will change,
surprising some, no doubt.
(3) *The mapping from MIME-type to 'mergeable' is ambiguous.*
Consider, again, application/xml. No amount of "updating the built-in
list of MIME types" is going to help here.

>> It is not much harder for Subversion to have a configurable list of
>> which MIME types (or MIME type patterns) should be considered mergeable.
>> (The configuration could be extensible: it could say line-wise-mergeable
>> or not mergeable or XML-mergeable or ...)
> [...]
>> Solution 0 (merge-mode)
>> ==========
>> So we could add a property to each file which says whether the file is
>> to be considered line-wise mergeable by Subversion, and say that this
>> property will be the primary source of this information. What are the
>> pros and cons of this?
> Big con: Depends on auto-props, a mechanism which is not really good
> enough for the task. (Strictly speaking it doesn't, because users can
> set the property manually, but in a practical sense it does because in a
> production environment it is impractical for users to consistently set
> the property manually.)

I find this criticism misleading. It leaves the impression that the
OP's proposal depends exclusively on the merge-mode property to
determine merging behavior. This is not the case. Merge-mode is only
needed when the existing heuristics -- which remain unchanged by the
proposal -- prove insufficient to the task.

To wit:

On Tue, Nov 17, 2009 at 21:56, Mike Samuel <mikesamuel_at_gmail.com> wrote:
> Subversion treats the following files as [[mergeable]]:
> * Files with no svn:mime-type [[and no svn:merge-mode]]
> * Files with a svn:mime-type starting "text/"
> * Files with a svn:mime-type equal to "image/x-xbitmap"
> * Files with a svn:mime-type equal to "image/x-xpixmap"
> * [[Files with a svn:merge-mode that is equal to "simple"]]
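Read as code, the quoted list amounts to roughly the following. This is
only a sketch in Python: the function name is_mergeable and the
dict-of-properties representation are illustrative assumptions, not
Subversion's API, and it assumes the bullets form a plain union (any one
match is enough). Values of svn:merge-mode other than "simple" are not
described in the quoted list and are not modeled here.

```python
def is_mergeable(props):
    """Return True if a file with these properties counts as mergeable.

    `props` is a hypothetical dict mapping property names to values,
    e.g. {"svn:mime-type": "text/plain"}.
    """
    mime_type = props.get("svn:mime-type")
    merge_mode = props.get("svn:merge-mode")  # proposed property

    # Proposed addition: an explicit merge-mode of "simple" is enough.
    if merge_mode == "simple":
        return True

    # No svn:mime-type (and, under the proposal, no svn:merge-mode).
    if mime_type is None and merge_mode is None:
        return True

    # Existing rules keyed on svn:mime-type, unchanged by the proposal.
    if mime_type is not None:
        if mime_type.startswith("text/"):
            return True
        if mime_type in ("image/x-xbitmap", "image/x-xpixmap"):
            return True

    return False
```

Under this reading, a file with svn:mime-type application/xml is not
mergeable by default, and becomes mergeable only once svn:merge-mode is
set to "simple" on it; files already covered by the existing rules need
no new property at all.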

Furthermore, the need for coordinating autoprops within a team exists
without this proposal, independent of issues of merge behavior. I fail
to see how the OP's proposal will in any way exacerbate that.

I've got about 100 lines of autoprops in the Subversion config file I
maintain for my team. This is how we try to ensure that most of our
files get reasonable mime-type, needs-lock, and eol-style properties
from the get-go. Periodically I fix things up with a bash script or
two. The need for this won't be affected by the merge-mode proposal.

// Ben

Received on 2009-11-18 12:31:47 CET

This is an archived mail posted to the Subversion Dev mailing list.