
Re: binary detection algorithm in SVN

From: Ben Collins-Sussman <sussman_at_collab.net>
Date: 2005-06-16 14:52:36 CEST

On Jun 16, 2005, at 7:31 AM, Cagatay Catal wrote:

> Hello,
>
> I am examining features of SVN and reading some notes about it.
>
> I have read Christophe Dupré's PowerPoint presentation,
> “Source Code Revision Control with Subversion”.
>
> In this presentation it is stated that the binary detection
> algorithm sometimes fails, and that something must be set manually.
>
> Why is that property not set by default? Or did I misunderstand
> something about this?

When you run 'svn add' or 'svn import', svn uses a heuristic to guess
whether a file is binary. It examines the first N bytes of the file
and looks for non-ASCII characters or NUL bytes. If a certain
percentage of them look this way, then the 'svn:mime-type' property
is automatically attached to the file with a value of
'application/octet-stream'. This prevents the svn client from
attempting contextual diffs and merges on that file in the future.
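
To make the idea concrete, here is a rough Python sketch of that kind of
heuristic. It is not Subversion's actual code: the sample size (1024) and
the threshold (15%) are assumed values standing in for "N" and "a certain
percentage", and the function names are made up for illustration.

```python
from typing import Optional

SAMPLE_SIZE = 1024  # assumed stand-in for "the first N bytes"
THRESHOLD = 0.15    # assumed stand-in for "a certain percentage"


def looks_binary(data: bytes,
                 sample_size: int = SAMPLE_SIZE,
                 threshold: float = THRESHOLD) -> bool:
    """Guess whether data is binary by sampling its leading bytes."""
    sample = data[:sample_size]
    if not sample:
        # An empty file gives no evidence either way; treat it as text.
        return False
    # Count NUL bytes and bytes outside the 7-bit ASCII range.
    suspicious = sum(1 for b in sample if b == 0 or b > 0x7F)
    return suspicious / len(sample) > threshold


def guess_mime_type(data: bytes) -> Optional[str]:
    """Return the generic binary MIME type, or None if data looks like text."""
    return "application/octet-stream" if looks_binary(data) else None


# A file whose first bytes are plain ASCII is guessed to be text, even if
# it is really a PDF. This is the kind of miss the slide is describing.
pdf_like = b"%PDF-1.4\n" + b"1 0 obj\n<< /Type /Catalog >>\nendobj\n" * 50
print(guess_mime_type(pdf_like))
print(guess_mime_type(bytes(range(256))))
```

Whatever the real values of N and the percentage are, any check of this
shape only sees the start of the file, so a file with an ASCII-looking
beginning can slip through.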

But there's no way this algorithm can be perfect; it's just educated
guessing. The slide is saying that, in the case of PDF files, the
algorithm sometimes doesn't think the file is binary. Yet PDF files
are definitely not line-based text files that can be contextually
diffed and merged, so humans need to intervene and set that property
manually.
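
Setting the property by hand is done with 'svn propset'. As a small sketch,
the Python helper below only builds that command line; the helper name, the
example file name, and the choice of 'application/pdf' as the value are
assumptions for illustration (svn treats any svn:mime-type value that does
not start with 'text/' as binary).

```python
from typing import List


def propset_mime_command(path: str,
                         mime_type: str = "application/pdf") -> List[str]:
    """Build the argv for marking a file's MIME type by hand (hypothetical helper)."""
    return ["svn", "propset", "svn:mime-type", mime_type, path]


# Hypothetical file name; pass the list to subprocess.run() inside a
# working copy to actually apply the property.
print(" ".join(propset_mime_command("manual.pdf")))
```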

Received on Thu Jun 16 14:58:16 2005

This is an archived mail posted to the Subversion Users mailing list.