Re: Merge mode (Was Classifying files as binary or text)

From: Julian Foad <julianfoad_at_btopenworld.com>
Date: Wed, 18 Nov 2009 10:55:11 +0000

Julian Foad wrote:
> I need to spend some time replying, late at night though it is.
>
> Let me try to explain why I think a "how to merge" property should not
> be the primary indicator of how subversion should merge each file.

Some comments on my own posting...

> Principle
> =========
>
> I have read that, in the realm of data handling, there is a principle
> that it is a bad idea to tag data with annotations that say what kind of
> actions can or should be performed on it. That kind of coupling is
> unscalable. Instead, it is better to tag data with an indication of what
> meaning and/or what syntax the data has, and then let tools decide what
> to do, based on that information.
>
> We already have one data-type indicator: svn:mime-type. Now, MIME type
> is far from a complete data type specifier. It is insufficient for our
> needs, in theory. However, in practice, it is nearly sufficient. (See
> Problem 2 below for an exception.)
>
> We also have another data-type indicator: the file name. A file name is
> also an incomplete source of metadata, and some file names ("README" or
> "CHANGES") give no indication at all of the format, but it is useful in
> many cases ("*.py", "*.c").

We also potentially have a third indicator that can be useful: a scan of
the content to determine whether it is mostly plain text (implemented by
the mis-named svn_io_detect_mimetype()).
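
To make the idea of a content scan concrete, here is a rough sketch of that
kind of heuristic. It is not the code of svn_io_detect_mimetype(); the
function name looks_like_text, the 1024-byte sample size and the 15%
threshold are all assumptions chosen for illustration.

```python
# Hypothetical sketch of a "mostly plain text" content scan.
# Not Subversion's implementation; sample size and threshold are assumed.

# Bytes treated as "text": tab, LF, FF, CR, printable ASCII, and all
# high-bit bytes (which may belong to UTF-8 or an 8-bit encoding).
_TEXT_BYTES = frozenset(
    [0x09, 0x0A, 0x0C, 0x0D]
    + list(range(0x20, 0x7F))
    + list(range(0x80, 0x100))
)


def looks_like_text(data: bytes,
                    sample_size: int = 1024,
                    max_binary_ratio: float = 0.15) -> bool:
    """Guess whether content is mostly plain text from a leading sample."""
    sample = data[:sample_size]
    if not sample:
        return True   # assumption: an empty file is harmless to treat as text
    if b"\x00" in sample:
        return False  # a NUL byte is a strong hint of binary data
    binary_count = sum(1 for byte in sample if byte not in _TEXT_BYTES)
    return binary_count / len(sample) <= max_binary_ratio
```

A scan like this can only say "probably text" or "probably not"; it says
nothing about the syntax of the file, which is why it is at best a fallback
behind the MIME type and the file name.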

> Problem 1 (limited recognition of MIME types)
> =========
>
> Subversion mis-categorizes a lot of MIME types as "binary" (and
> therefore will not merge or diff or blame them) which really are
> line-based text formats.
>
> The list of such MIME types is continually evolving so it is not
> possible for Subversion to have a built-in complete list. However, it is
> easy for new releases of Subversion to have an updated list.

I wonder if I have missed some good reason not to update the built-in
list of MIME types that are considered "text". Just a constraint on the
available volunteer effort? I know it's not the 100% solution but it is
a 99% solution and I am surprised we are not automatically starting
here.

> It is not much harder for Subversion to have a configurable list of
> which MIME types (or MIME type patterns) should be considered mergeable.
> (The configuration could be extensible: it could say line-wise-mergeable
> or not mergeable or XML-mergeable or ...)
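
As a rough illustration of what such a configurable list might look like,
here is a sketch of an ordered table of MIME type patterns mapped to merge
modes. Nothing here is an existing Subversion feature: the rule table, the
mode names ("line", "xml", "none") and the function merge_mode are all
hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical configuration: ordered (pattern, merge mode) pairs.
# The first matching pattern wins; the modes are illustrative names only.
MERGE_RULES = [
    ("text/*", "line"),
    ("application/xml", "xml"),
    ("application/*+xml", "xml"),
    ("image/svg+xml", "xml"),
    ("application/json", "line"),
    ("*", "none"),  # anything unrecognized is treated as not mergeable
]


def merge_mode(mime_type: str, rules=MERGE_RULES) -> str:
    """Return the merge mode for an svn:mime-type value (sketch only)."""
    # Drop parameters such as "; charset=UTF-8" and normalize case.
    base = mime_type.split(";", 1)[0].strip().lower()
    for pattern, mode in rules:
        if fnmatchcase(base, pattern):
            return mode
    return "none"
```

With this table, merge_mode("text/x-python") gives "line" and
merge_mode("image/png") gives "none". The point is that the property on the
file still says what the data is, and the decision about how to merge it
lives in one place in the tool's configuration.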

[...]

> Solution 0 (merge-mode)
> ==========
>
> So we could add a property to each file which says whether the file is
> to be considered line-wise mergeable by Subversion, and say that this
> property will be the primary source of this information. What are the
> pros and cons of this?

Big con: Depends on auto-props, a mechanism which is not really good
enough for the task. (Strictly speaking it doesn't, because users can
set the property manually, but in a practical sense it does because in a
production environment it is impractical for users to consistently set
the property manually.)

- Julian

------------------------------------------------------
http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2419449
Received on 2009-11-18 11:55:34 CET

This is an archived mail posted to the Subversion Dev mailing list.