
Re: Blue-sky idea: Representation reuse

From: mark benedetto king <bking_at_inquira.com>
Date: 2002-10-12 15:11:52 CEST

On Wed, Oct 09, 2002 at 10:22:57PM -0500, Jim Blandy wrote:
>
> You should look at the papers surrounding rsync. They had stuff in
> there for detecting plagiarism in a corpus of student work; that
> definitely involved recognizing partial matches in a large corpus of
> files. This has definitely been done.
>

Yes, it has been done, but AFAIK the approaches split into the "sound"
O(n^2) ones and the "practical" O(n log n) ones.

What if the repository were stored on a compressed filesystem?
This could (IMO) give us total storage savings similar to those of
a shared-representation model, without any additional code or
testing.
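
A rough way to check this on a sample of file contents is sketched below in
Python. It is only an illustration under stated assumptions: zlib stands in
for whatever a compressed filesystem would actually do, the shared model is
simplified to storing byte-identical contents once, and all function names
are hypothetical rather than anything in Subversion.

```python
import hashlib
import zlib


def raw_size(blobs):
    """Total bytes with no compression and no sharing."""
    return sum(len(b) for b in blobs)


def compressed_size(blobs):
    """Total bytes if each blob is compressed on its own.

    Assumption: zlib is a stand-in for a compressed filesystem.
    """
    return sum(len(zlib.compress(b)) for b in blobs)


def shared_size(blobs):
    """Total bytes if byte-identical blobs are stored only once.

    Assumption: sharing is whole-blob only, keyed by a SHA-1 digest.
    """
    unique = {hashlib.sha1(b).digest(): len(b) for b in blobs}
    return sum(unique.values())


if __name__ == "__main__":
    sample = [b"a" * 1000, b"a" * 1000, b"b" * 1000]
    print("raw:       ", raw_size(sample))
    print("compressed:", compressed_size(sample))
    print("shared:    ", shared_size(sample))
```

Running the three functions over real repository contents, rather than the
toy sample here, would show how close the two approaches come in practice.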

--ben

Received on Sat Oct 12 15:19:48 2002

This is an archived mail posted to the Subversion Dev mailing list.
