
Re: problems with mimetype of and empty utf8 files in svn 1.7

From: Tomáš Bihary <bihary.t_at_st-software.com>
Date: Fri, 21 Oct 2011 13:27:13 +0200

Hi,

Yes, you are right, SVN 1.6.16 does it as well, so it's not a new issue in 1.7.

The behaviour for such a file just differs in the TortoiseSVN Commit
Dialog, so I didn't notice it before:
- 1.6 does not display any property status; you have to look at
the properties to see the octet-stream mime-type
- 1.7 displays a modified property status

Regards
   Tomas

> On Fri, Oct 21, 2011 at 11:15:19AM +0200, Tomáš Bihary wrote:
>> Hello,
>>
>> after upgrading to svn 1.7 I noticed an issue.
>>
>> When I add an empty UTF-8 file, it is added with mimetype
>> application/octet-stream.
>> The empty UTF-8 file is 3 bytes long: it contains only the three BOM
>> bytes 0xEF 0xBB 0xBF.
> The same happened in svn 1.6 though, didn't it?
>
> We could special-case the UTF-8 BOM in the mime-type detection logic.
> See the patch below. Can you try this patch?
>
> Note that the UTF-8 BOM is, as far as I know, only used by Windows.
> So this problem should only happen there.
>
>> If there is any content in that file, it is handled correctly as text.
>>
>> I've found a similar issue, 2194, about UTF-16 files, whose status is REOPENED.
> UTF-16 files are always marked binary at the moment.
> This is because the internal diff and merge functionality does not
> understand UTF-16. As a workaround, you can configure an external
> diff/merge tool that understands UTF-16 files:
> http://svnbook.red-bean.com/nightly/en/svn.advanced.externaldifftools.html
>
>
> Index: subversion/libsvn_subr/io.c
> ===================================================================
> --- subversion/libsvn_subr/io.c (revision 1186983)
> +++ subversion/libsvn_subr/io.c (working copy)
> @@ -2968,6 +2968,13 @@ svn_io_detect_mimetype2(const char **mimetype,
> /* Now close the file. No use keeping it open any more. */
> SVN_ERR(svn_io_file_close(fh, pool));
>
> + if (amt_read == 3 && block[0] == 0xEF && block[1] == 0xBB && block[2] == 0xBF)
> + {
> + /* This is an empty UTF-8 file which only contains the UTF-8 BOM.
> + * Treat it as plain text. */
> + return SVN_NO_ERROR;
> + }
> +
> if (svn_io_is_binary_data(block, amt_read))
> *mimetype = generic_binary;
>
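For readers who want to try the BOM check from the patch above outside of Subversion, here is a minimal standalone sketch in C. The function name is_bom_only_utf8 is hypothetical and does not exist in Subversion. The sketch assumes the caller has already read the start of the file into an unsigned char buffer, as the patch appears to do with block and amt_read.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Subversion's API.
 * Returns 1 when the buffer holds exactly the three-byte UTF-8 BOM
 * (0xEF 0xBB 0xBF) and nothing else, i.e. an "empty" UTF-8 file as
 * described in the report; returns 0 otherwise.
 * The buffer must be unsigned char: with plain (possibly signed) char
 * the comparisons against 0xEF and friends could never be true. */
int
is_bom_only_utf8(const unsigned char *buf, size_t len)
{
  return len == 3
         && buf[0] == 0xEF
         && buf[1] == 0xBB
         && buf[2] == 0xBF;
}
```

A caller would run this check before any binary-data heuristic and skip setting a binary mime-type when it returns 1. A file with real content after the BOM is deliberately not matched, since the report says such files are already handled as text.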
Received on 2011-10-21 13:27:36 CEST

This is an archived mail posted to the Subversion Dev mailing list.
