Re: Towards standardising mime-type support

From: Branko Čibej <brane_at_xbc.nu>
Date: 2005-04-01 11:47:07 CEST

Trent Apted wrote:

> Thanks for your reply.
> Branko Čibej wrote:
>> Trent Apted wrote:
>>> When I run
>>> $ file -bi something.c
>>> (or .cpp, .h, .cc, etc.)
>>> /usr/bin/file reports that the mime-type for C/C++ files is
>>> text/x-c; charset=us-ascii
>> Well, this is clearly wrong, there's no such thing as a "text/x-c"
>> mime type.
> Perhaps true. RFC2046 *only* defines the 'plain' subtype of the text
> mime type, but we all use text/html. It also says any unrecognised
> mime types should just be treated as text/plain, so perhaps whether or
> not it is valid is moot.

RFC2046 isn't the canonical reference. This
(http://www.iana.org/assignments/media-types/) is the canonical reference.

>>> However, if I feed this to Subversion, it treats the file as binary.
>>> So, that's fine, I'll drop the charset stuff,
>> Ah, yes, we should know about charset attributes.
>>> and things are mostly back to normal, but the added information
>>> still appears to be meaningless to Subversion.
>> Define "meaningless". You've told SVN that this is a text file, and
>> that's it. SVN doesn't interpret the mime type any further than that
>> (yet).
> I guess I'm saying that text/x-c and text/x-cc would imply that a file
> is source code, and hence platform independent, thus should always use
> the 'native' eol-style and should never be executable. While you
> should still specify the style and executability for something that is
> text/plain. However, this might not suit everyone...

Media types do not define the encoding, only the type of the contents.
Therefore we a) can't extrapolate eol-style from the mime type, and b)
would be totally wrong to do so because there are valid reasons _not_ to
use native eol-style even in mixed-platform environments.
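
On the charset attribute discussed above, here is a minimal sketch of how a
media type string such as the one /usr/bin/file prints could be split into
its type, subtype, and parameters, so that the "is this text?" decision
looks only at the top-level type. This is not Subversion's code; the
function names split_media_type and looks_like_text are hypothetical and
exist only for this illustration.

```python
def split_media_type(value):
    """Split 'type/subtype; key=val' into (type, subtype, params).

    Hypothetical helper for illustration; not part of Subversion.
    """
    essence, *raw_params = value.split(";")
    major, _, minor = essence.strip().lower().partition("/")
    params = {}
    for item in raw_params:
        key, sep, val = item.partition("=")
        if sep:  # ignore malformed parameters that have no '='
            params[key.strip().lower()] = val.strip().strip('"')
    return major, minor, params


def looks_like_text(value):
    """Classify by top-level type only, ignoring parameters like charset."""
    major, _, _ = split_media_type(value)
    return major == "text"
```

With this, "text/x-c; charset=us-ascii" and "text/x-c" are classified the
same way. Note that nothing here derives eol-style from the type, for the
reasons given above.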

>>> Further, perhaps there should be a feature with support for
>>> `/usr/bin/file -bi` --- "auto-auto-props" might be nice.
>> We've talked about using platform-specific mechanisms to guess the
>> mime type. What's missing is somebody with enough time on their hands
>> to actually do this.
> /usr/bin/file uses a tab-separated file of the form:
> $ cat /usr/share/misc/file/magic.mime
> # Magic data for KMimeMagic (originally for file(1) command)
> #
> # The format is 4-5 columns:
> # Column #1: byte number to begin checking from, ">" indicates continuation
> # Column #2: type of data to match
> # Column #3: contents of data to match
> # Column #4: MIME type of result
> # Column #5: MIME encoding of result (optional)
> #------------------------------------------------------------------------------
> This would not be platform-specific.
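
To make the quoted column layout concrete, here is a rough sketch of a
reader for one line of such a file. It assumes the columns are separated by
one or more tabs, as described in the quote, and that a line with only
three columns (no MIME type) can occur; both are assumptions, and real
magic files have escaping and type rules this does not handle. MagicEntry
and parse_magic_line are hypothetical names, and the sample lines in the
usage note are illustrative rather than copied from a real magic.mime.

```python
import re
from typing import NamedTuple, Optional


class MagicEntry(NamedTuple):
    level: int                 # number of leading ">" (0 = top-level test)
    offset: str                # column 1 without the ">" prefix
    data_type: str             # column 2
    pattern: str               # column 3
    mime_type: Optional[str]   # column 4, if present
    encoding: Optional[str]    # column 5, optional


def parse_magic_line(line):
    """Return a MagicEntry, or None for blank, comment, or short lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    cols = re.split(r"\t+", stripped)  # assumption: tab-separated columns
    if len(cols) < 3:
        return None
    first = cols[0]
    offset = first.lstrip(">")
    level = len(first) - len(offset)
    mime_type = cols[3] if len(cols) > 3 else None
    encoding = cols[4] if len(cols) > 4 else None
    return MagicEntry(level, offset, cols[1], cols[2], mime_type, encoding)
```

For example, parse_magic_line("0\tstring\t/*\ttext/x-c") yields a top-level
entry with offset "0", while a line starting with ">2" yields a
continuation entry with level 1 and offset "2".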


Oh and surely it would not be platform specific at all, aye. It would
work on *every* Linux box in the world (well, except some older ones).

> Actually, looking further it appears that the heuristic for c/c++
> files is that if a file starts with "/*" it is C, and if with "//" it
> is C++. `0xbabe` is a Java class file, etc.
> I can write a patch, if you like.. PhD students tend to find time for
> all kinds of non-thesis pursuits ;-)

Any change in this direction must be generic in the sense that it allows
platform-specific implementations, and it must produce portable results.
I will veto any patch that produces MIME types that are not in the IANA

-- Brane

Received on Fri Apr 1 11:49:43 2005

This is an archived mail posted to the Subversion Users mailing list.
