> How does the tree structure of an in-memory delta help anyone if you
> never have a whole delta in memory at once? Even if you can stream a
I am puzzled by something -- perhaps naively.
In the abstract, I understand that doing everything in a stream-oriented
fashion can significantly reduce memory requirements. On the other hand,
in-memory designs are a hell of a lot simpler, and modern swap areas are
decently large. I'm wondering whether there are other reasons I am missing
that make the streaming design compelling?
In particular, I've been working on the text merge problem. The usual
solution takes two pairwise deltas and then merges them. If S and T are the
two strings in a pairwise delta, then the classical algorithm requires
O(|S| x |T|) space and O(|S| x |T|) time. Using a divide-and-conquer
strategy, one can reduce the space requirement to O(|S| + |T|). However,
that O(|S| + |T|) seems to be a pretty hard lower bound. The difficulty is
that building the delta matrix requires comparing the characters at
corresponding locations, and to do that you need both strings in memory
at once.
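For concreteness, here is a sketch of both bounds, using the longest common
subsequence as the delta kernel (the function names and the LCS choice are
my own; the divide-and-conquer alignment recovery itself, in the style of
Hirschberg, is omitted):

```python
def lcs_table(s, t):
    """Classical DP: builds the full (|S|+1) x (|T|+1) matrix,
    so O(|S| * |T|) time and space."""
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if s[i - 1] == t[j - 1]:
                d[i][j] = d[i - 1][j - 1] + 1
            else:
                d[i][j] = max(d[i - 1][j], d[i][j - 1])
    return d

def lcs_last_row(s, t):
    """Same recurrence keeping only two rows: O(|T|) space for the
    lengths; recovering the actual alignment in O(|S| + |T|) space
    then takes a Hirschberg-style divide and conquer on top of this."""
    prev = [0] * (len(t) + 1)
    for ch in s:
        cur = [0]
        for j, tj in enumerate(t, 1):
            cur.append(prev[j - 1] + 1 if ch == tj
                       else max(prev[j], cur[j - 1]))
        prev = cur
    return prev
```

Either way, both strings are indexed at arbitrary positions, which is the
point: the space bound comes from the inputs themselves, not the matrix.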
You could certainly accomplish this by streaming the files through memory
multiple times, but it seems to me that memory-mapping the files is a better
solution. In either case you need the full content, because you need to be
able to rewind the file, so the streaming solution saves no space.
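The memory-mapping idea can be sketched in a few lines (a hypothetical
helper using Python's mmap module; the OS pages the files in and out as
needed, while the merge code still gets rewindable random access):

```python
import mmap

def compare_at(path_a, path_b, offset, length):
    """Compare a byte range of two files via mmap, without reading
    either file wholly into the process heap up front."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb, \
         mmap.mmap(fa.fileno(), 0, access=mmap.ACCESS_READ) as ma, \
         mmap.mmap(fb.fileno(), 0, access=mmap.ACCESS_READ) as mb:
        return ma[offset:offset + length] == mb[offset:offset + length]
```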
Is there some other rationale for streaming that I am failing to notice?
Received on Sat Oct 21 14:36:06 2006