"Jonathan S. Shapiro" <email@example.com> writes:
> I am puzzled by something -- perhaps naively.
> In abstract, I understand that doing everything in stream-oriented fashion
> can significantly reduce memory requirements. On the other hand, in-memory
> designs are a hell of a lot simpler and modern swap areas are decently
> large. I'm wondering if there are other reasons to do this that I am missing
> that make the streaming design compelling?
We know of at least one tree, planned to be imported into Subversion,
that's 200 megs on checkout. This is large enough to make streaming
> In particular, I've been working on the text merge problem. The usual
> solution operates by taking two pairwise deltas and then merging the deltas.
> If S and T are the two strings in the pairwise delta, then the classical
> algorithm requires O(|S| x |T|) space and O(|S| x |T|) time. Using a divide
> and conquer strategy, one can reduce the space requirement to O(|S| + |T|).
> However, that O(|S| + |T|) seems to be a pretty hard lower bound. The
> difficulty is that building the delta matrix requires that you compare the
> characters at corresponding locations, and to do this you need to have both
> files accessable.
The generation of text deltas happens any way you want; streaminess is
for receiving deltas (tree or text).
Received on Sat Oct 21 14:36:06 2006