
Re: Search subversion binary content

From: Daniel L. Rall <dlr_at_finemaltcoding.com>
Date: 2005-10-07 23:22:05 CEST

On Fri, 07 Oct 2005, Ted Shab wrote:

> Hello,
>
> Is there a best practice for search subversion binary
> content?
 
By "binary content", I'm going to assume that you literally mean searching
for any binary string of data (as opposed to textual).

Do many engines out there generate useful indices without tokenization
patterns? Given binary content, are there any tokens to generate an index
from, a la natural language words or characters, or patterns (in images,
music, etc.) which would work? If so, you might want to generate an index of
the repository (via post-commit hook, periodic background processing, or
both).
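
One candidate tokenization, purely as an illustration, is to treat every
overlapping run of n bytes as a token. Below is a minimal Python sketch of
that idea, assuming the content fits in memory. The names build_ngram_index
and search and the default n=4 are hypothetical and not taken from any
existing engine. An n-gram hit only narrows down the candidates, so each
candidate is still verified against the full pattern.

```python
from collections import defaultdict


def build_ngram_index(blobs, n=4):
    """Map every n-byte sequence to the set of blob names that contain it."""
    index = defaultdict(set)
    for name, data in blobs.items():
        for i in range(len(data) - n + 1):
            index[data[i:i + n]].add(name)
    return index


def search(index, blobs, pattern, n=4):
    """Return the sorted names of blobs containing pattern (len(pattern) >= n)."""
    if len(pattern) < n:
        raise ValueError("pattern must be at least n bytes long")
    candidates = set(blobs)
    for i in range(len(pattern) - n + 1):
        # A blob can only match if it contains every n-gram of the pattern.
        candidates &= index.get(pattern[i:i + n], set())
    # Sharing all n-grams is necessary but not sufficient, so confirm each hit.
    return sorted(name for name in candidates if pattern in blobs[name])
```

Such an index could be rebuilt or extended from a post-commit hook, at the
cost of an index that may grow larger than the content itself.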

If not, you could use a primitive solution like grep'ing a checkout of a
specific rev of the repository, or a more hands-on approach like a crawler
which searches on demand: it walks the repository over a specified tree
and revision range, re-assembles each revision, and searches it.
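
The primitive approach can be sketched in a few lines of Python. This is
only a rough illustration, assuming a working copy already checked out at
the revision of interest. The function name find_bytes and its chunk_size
parameter are hypothetical. It walks the tree, skips the .svn
administrative directories, and reports the offset of the first match in
each file, reading in chunks so that large binaries need not fit in memory.

```python
import os


def find_bytes(root, pattern, chunk_size=1 << 20):
    """Yield (path, offset) for the first occurrence of pattern in each file."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    keep = len(pattern) - 1  # bytes carried over so boundary matches are seen
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune Subversion's administrative directories from the walk.
        dirnames[:] = [d for d in dirnames if d != ".svn"]
        for fname in sorted(filenames):
            path = os.path.join(dirpath, fname)
            with open(path, "rb") as fh:
                tail = b""
                base = 0  # file offset of the first byte of tail + chunk
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    buf = tail + chunk
                    pos = buf.find(pattern)
                    if pos != -1:
                        yield path, base + pos
                        break
                    tail = buf[-keep:] if keep else b""
                    base += len(buf) - len(tail)
```

A crawler over a revision range would repeat the same search once per
revision, which is why it gets expensive quickly on a large history.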

> What tools have people had experience using in this
> manner?

I haven't heard of anyone doing binary searching (as described above).

Received on Fri Oct 7 23:23:06 2005

This is an archived mail posted to the Subversion Dev mailing list.
