
Re: Classifying files as binary or text

From: Mike Samuel <mikesamuel_at_gmail.com>
Date: Thu, 12 Nov 2009 17:35:46 -0800

2009/11/12 Branko Cibej <brane_at_xbc.nu>:
> Mike Samuel wrote:
>> 2009/11/12 Mark Phippard <markphip_at_gmail.com>:
>>> On Thu, Nov 12, 2009 at 6:20 PM, Mike Samuel <mikesamuel_at_gmail.com> wrote:
>>>> Conclusions from the svn:charset thread that Mark pointed out:
>>>> (1) This proposal should not gate on svn:charset since it isn't yet
>>>> recognized as official
>>>> (2) We should avoid the term encoding in documentation of this feature.
>>>> (3) There may be some bad interactions between ";charset=" in
>>>> svn:mime-type and auto-props, but this proposal does not raise new
>>>> issues, and those issues are a result of an error (possibly since
>>>> fixed?) in auto-props.
>>>> From the svn:charset thread:
>>>> Much of the early debate deals with svn:charset being non-standard and
>>>> non-approved.  I tend to agree with Stefan, that this proposal
>>>> shouldn't gate on svn:charset being approved so I suggest tabling
>>>> variant 1.
>>> Correct me if I am wrong, but the only real goal we have right now is
>>> to improve SVN's ability to tell itself "this is text" and I can do
>>> textual merging?
>> That is correct.
>>> So why not just add an svn:text property that has a
>>> value of '*'.  The presence of the property means "treat this as
>>> text".
>> To make sure I understand your counter-proposal, would a file be
>> treated as text if at least one of (svn:mime-type starts with "text/"
>> or matches the existing whitelist) OR (svn:text exists and is "*")?
>> Or are you advocating dropping the first clause which is there for
>> backwards-compatibility?
> I think we all agree that using the MIME-type to decide whether we can
> use contextual text-base merge for a file has turned out to be trickier
> than we originally expected. It makes sense to find a better solution to
> the problem.

I'm afraid I'm unfamiliar with these discussions, since I just joined
this list and have never submitted a patch to SVN before.
Can you explain why my argument that MIME types do specify "textiness",
as described in RFC 2046, is flawed, or point me at the threads that
discuss where the trickiness comes from?

> I would suggest just slightly future-proofing Mark's proposal, and
> certainly staying backward-compatible. The rules should go like this:
>    * If a file has no content-related property (i.e., no svn:mime-type),
>      treat it as text.
>    * If there's only an svn:mime-type property, keep its current semantics.
>    * Introduce a new property that overrides svn:mime-type, but don't
>      call it svn:text (which implies it's a boolean), but, e.g., just
>      svn:type or svn:merge-mode or some such.

> The value of this property is a keyword. Initially, the allowed values
> would be "text" and "binary", but I can imagine adding "xml" in the
> future if someone wants to add XML-aware merging (I've come across such
> requests several times, and have used a similar feature in ClearCase to
> good effect). svn:mime-type needn't go away or be deprecated; it's
> useful in other contexts.
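
For illustration only, the three rules quoted above can be sketched in
Python as below. This is a rough sketch under stated assumptions, not
Subversion code: the property name "svn:merge-mode" is just one of the
candidate names floated above, the function merge_mode and the constant
TEXT_MIME_WHITELIST are hypothetical, and the whitelist is left empty
because this thread does not list its contents. The "current semantics"
branch follows the reading given earlier in the thread (svn:mime-type
starts with "text/" or matches the whitelist).

```python
from typing import FrozenSet, Mapping

# Placeholder: the thread refers to an "existing whitelist" of MIME types
# treated as text but does not enumerate it, so it is left empty here.
TEXT_MIME_WHITELIST: FrozenSet[str] = frozenset()

# Hypothetical property name; "svn:type" was floated as an alternative.
MERGE_MODE_PROP = "svn:merge-mode"
ALLOWED_MODES = ("text", "binary")


def merge_mode(props: Mapping[str, str],
               whitelist: FrozenSet[str] = TEXT_MIME_WHITELIST) -> str:
    """Return "text" or "binary" for a file's property set."""
    # Rule 3: the new property, when present, overrides svn:mime-type.
    override = props.get(MERGE_MODE_PROP)
    if override is not None:
        if override not in ALLOWED_MODES:
            raise ValueError("unknown merge mode: %r" % override)
        return override

    # Rule 1: no content-related property at all means text.
    mime = props.get("svn:mime-type")
    if mime is None:
        return "text"

    # Rule 2: only svn:mime-type is set, so keep the existing behaviour.
    # Parameters such as ";charset=" are ignored for this decision.
    base = mime.split(";", 1)[0].strip().lower()
    if base.startswith("text/") or base in whitelist:
        return "text"
    return "binary"
```

A keyword value rather than a boolean leaves room for a later "xml"
mode: it would be one more entry in ALLOWED_MODES, with no change to
the precedence order.
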

That sounds like a fine idea, but a (how-to-merge?) property seems
only tangentially related to (is-text?). This is exactly the kind of
overloading that I think should be avoided. Yes, merging and diffing
are definitely related concerns, but doing type-sensitive merging is a
big change and I think that it would be good not to introduce new
properties to enable it until there's significant agreement on goals
and how to stage rollout.

> -- Brane
> ------------------------------------------------------
> http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=2417343

Received on 2009-11-13 02:35:58 CET

This is an archived mail posted to the Subversion Dev mailing list.