Re: Text mime types

From: Vincent Lefevre <vincent+svn_at_vinc17.org>
Date: 2005-06-14 18:50:07 CEST

On 2005-06-14 15:21:47 +0100, Julian Foad wrote:
> Vincent Lefevre wrote:
> >http://httpd.apache.org/docs-2.0/en/mod/mod_mime.html#contentencoding
>
> OK, thanks. That helps to explain how content type is handled in HTML.

Concerning Apache, this would be HTTP more than HTML. But of course,
other protocols, file systems (with metadata) or utilities (e.g.
"file -zi" under Linux) could reuse the same ideas.

> OK. So you're saying we should move closer to the way content type is
> notified in HTML*, and add a "content encoding" to the meta-data of
> Subversion files, as the first indicator of how to handle a file. If no
> encoding is specified for a file, then the Subversion client program would
> look at the MIME type to determine how to handle it. If an encoding is
> specified, then we could design Subversion to decode the file before
> applying operations such as "diff" and "merge" (and it would look at the
> MIME type to determine what to do after decoding), and encode it
> afterwards.

For diff, there would be no encode step as this wouldn't make much
sense. However the diff behavior after the possible decode step could
depend on the MIME type (through an option). For instance, this would
allow "true" XML diff (hasn't this been requested before?).
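
To make the decode-then-diff idea concrete, here is a minimal Python sketch. It assumes a hypothetical per-file "content encoding" property alongside the MIME type; no such property exists in Subversion, and the names `decode` and `diff_decoded` are invented for this illustration. Only gzip is handled, and the MIME type merely selects between a line-based diff and a "files differ" notice.

```python
import difflib
import gzip
from typing import List, Optional


def decode(data: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo one content encoding; None means the data is stored as-is."""
    if content_encoding is None:
        return data
    if content_encoding == "gzip":
        return gzip.decompress(data)
    raise ValueError("unsupported content encoding: %s" % content_encoding)


def diff_decoded(old: bytes, new: bytes,
                 content_encoding: Optional[str],
                 mime_type: str) -> List[str]:
    """Decode both versions, then diff according to the MIME type.

    There is deliberately no encode step: the diff is only displayed.
    """
    old_plain = decode(old, content_encoding)
    new_plain = decode(new, content_encoding)
    if not mime_type.startswith("text/"):
        # Non-text data: only report whether the decoded contents differ.
        return ["Binary files differ"] if old_plain != new_plain else []
    old_lines = old_plain.decode("utf-8").splitlines()
    new_lines = new_plain.decode("utf-8").splitlines()
    return list(difflib.unified_diff(old_lines, new_lines,
                                     "old", "new", lineterm=""))


if __name__ == "__main__":
    before = gzip.compress(b"alpha\nbeta\n")
    after = gzip.compress(b"alpha\ngamma\n")
    print("\n".join(diff_decoded(before, after, "gzip", "text/plain")))
```

A MIME-type-specific differ (for XML, say) would slot in where the sketch falls back to the plain line-based diff.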

For merge, it would depend on the content encoding in the working copy.
Do I need to go into detail?

> The user could be given the option of not having this decode/encode
> step performed.

Yes.

> You must be implying that we should add this "content encoding" field,
> because without it there is no point in knowing what MIME type the data
> would have after decoding. I must have missed where you said this.

Yes, this kind of thing. This is basically why the "content encoding"
notion is used with HTTP.

I don't know about mail. The application/octet-stream MIME type is
generally (always?) used for gzipped files, and I can say that this
is really annoying when I want to view gzip-compressed files from my
MUA; of course, I could use a handler for application/octet-stream
attachments and guess the real MIME type... just as if MIME types
had never existed. So, the "content encoding" system would be a real
benefit.
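
As an aside, the same two-level split (type plus encoding) can be seen in Python's standard `mimetypes` module, whose `guess_type` returns a `(type, encoding)` pair. The short sketch below only illustrates the concept; the file names are made up, and guessing from file names is of course weaker than stored metadata.

```python
import mimetypes

# Hypothetical file names, chosen only to show the (type, encoding) split.
for name in ("notes.txt", "notes.txt.gz", "blob.gz"):
    mime_type, encoding = mimetypes.guess_type(name)
    # The encoding describes the outer wrapper (e.g. gzip); the MIME type
    # describes what the data is once that wrapper has been removed.
    print("%-14s type=%s encoding=%s" % (name, mime_type, encoding))
```

For the gzipped text file, the guessed type stays `text/plain` and gzip is reported separately as the encoding, rather than the whole thing collapsing into application/octet-stream.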

> This may be a direction that we want to go in. I don't know. I was
> working on the assumption that we had just one field describing the
> file's outermost type, and that therefore that field would say that
> a file was gzipped data, but would not say what kind of data had
> been gzipped.
>
> I can't help feeling that HTML's two-level scheme (content-type and
> content-encoding) lacks generality: for instance it can't handle
> more than one encoding such as a text file that is gzipped and then
> uuencoded.

But do you really store uuencoded gzipped files in your Subversion
repository?

Anyway this isn't a problem since content codings can be chained.
See <http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.11>.

    Content-Encoding = "Content-Encoding" ":" 1#content-coding

Note: 1# means 1 or more, separated by commas. It also says:

    If multiple encodings have been applied to an entity, the content
    codings MUST be listed in the order in which they were applied.

Note that uuencode isn't defined as a content coding, since it would
never be used as one anyway (perhaps just as a transfer encoding). See
"3.5 Content Codings" vs "3.6 Transfer Codings" in RFC 2616.
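
The chaining rule can be sketched in a few lines of Python: split the header value on commas, then undo the codings in reverse order, since they are listed in the order they were applied. This is only an illustration under stated assumptions: the `DECODERS` table and both function names are invented here, only gzip, deflate and identity are covered, and "deflate" is taken in the RFC 2616 sense (zlib format).

```python
import gzip
import zlib

# Assumed decoder table for this sketch; "compress" is left out.
DECODERS = {
    "gzip": gzip.decompress,
    "x-gzip": gzip.decompress,      # alias listed in RFC 2616 section 3.5
    "deflate": zlib.decompress,     # zlib-wrapped deflate data
    "identity": lambda data: data,
}


def parse_content_encoding(value):
    """Split a 1#content-coding list into lowercase tokens, in applied order."""
    return [token.strip().lower() for token in value.split(",") if token.strip()]


def decode_entity(body, header_value):
    """Undo every listed coding, last applied first."""
    for coding in reversed(parse_content_encoding(header_value)):
        if coding not in DECODERS:
            raise ValueError("unknown content coding: %s" % coding)
        body = DECODERS[coding](body)
    return body


if __name__ == "__main__":
    # deflate was applied first, then gzip, so the header lists them that way.
    entity = gzip.compress(zlib.compress(b"hello, world\n"))
    print(decode_entity(entity, "deflate, gzip"))
```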

> In practice this probably isn't much of a problem, but it still
> bothers me. I have for years thought about designing a hierarchical
> content-type description scheme, but haven't got very far.

I think all of this is already in HTTP/1.1.

-- 
Vincent Lefèvre <vincent_at_vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA

Received on Tue Jun 14 18:51:10 2005
