
Re: Ascii/binary detection.

From: Jim Blandy <jimb_at_zwingli.cygnus.com>
Date: 2001-08-04 06:23:40 CEST

kfogel@collab.net writes:
> > >1. Develop a heuristic for determining the binariness of a file, say
> > > svn_io_is_binary_file ()
> > >
> > (Two suggestions: a) don't mark the file as binary just because there's
> > a byte with value >= 128 in it; b) if other tests aren't conclusive,
> > check for extremely long lines in the file?)
>
> That combination is exactly what we were planning to do, yeah -- some
> combination of a) at least a certain percentage of bytes with the high
> bit set, and b) long lines.

Since the whole point of this is to save the user the trouble of
explicitly specifying the type of each file, it seems to me that
binary detection is a great job for a client-side plugin. Users
generally know what kind of files are typically found in their
filesystems. If someone has a lot of UTF-8 encoded Chinese text, then
she'll want a different heuristic than what I'd want.
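
To make the plugin idea concrete, here is a rough sketch of what swappable client-side detectors could look like. It is illustrative only and not Subversion code: the names make_detector and utf8_aware_detector, the 30% high-bit threshold, and the 2000-byte line limit are all assumptions chosen for the example.

```python
import codecs
from typing import Callable

# A detector looks at a sample of a file's bytes and returns True for "binary".
Detector = Callable[[bytes], bool]


def make_detector(max_high_bit_ratio: float = 0.30,
                  max_line_length: int = 2000) -> Detector:
    """Build a detector from the two criteria discussed in the thread.

    Both default thresholds are hypothetical values, not Subversion's.
    """
    def is_binary(sample: bytes) -> bool:
        if not sample:
            return False
        # a) share of bytes with the high bit set
        high = sum(1 for b in sample if b >= 0x80)
        if high / len(sample) > max_high_bit_ratio:
            return True
        # b) extremely long lines
        longest = max(len(line) for line in sample.split(b"\n"))
        return longest > max_line_length
    return is_binary


def utf8_aware_detector(sample: bytes) -> bool:
    """Alternative plugin for users with a lot of UTF-8 text.

    Treats the sample as text whenever it decodes as UTF-8. The incremental
    decoder tolerates a multi-byte character cut off at the end of the sample.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return True
    return False


# Each user picks whichever detector suits the files they usually handle.
default_detector = make_detector()
chinese_text = "中文文本\n".encode("utf-8") * 10
```

With these definitions, default_detector(chinese_text) reports binary because nearly every byte has the high bit set, while utf8_aware_detector(chinese_text) reports text. That difference is the reason to let the user choose the heuristic.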

Received on Sat Oct 21 14:36:35 2006

This is an archived mail posted to the Subversion Dev mailing list.
