[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Why streams?

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2000-08-16 16:12:37 CEST

No matter what your memory capacity is, you will always be able to
outstrip it with on-disk storage. Streaming is definitely a laudable
goal.

> However, that O(|S| + |T|) seems to be a pretty hard lower bound.
[for diffs]

This is why binary diffing algorithms use windows (and why "diff" uses
whole lines of text, to reduce the number of comparison points). The
stupidest kind of windowing is: you diff the first N bytes of S with
the first N bytes of T, then the next N bytes of each file, and so on.
Your storage requirements are then O(windowsize). By sliding the
source window, you can do somewhat better.

Memory-mapping also works, but it may not perform very well if the
delta tells you to wander all over the source file, and it certainly
isn't going to work for data kept in a Berkely db database.
Received on Sat Oct 21 14:36:06 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.