> I've been worrying about something else, though. vdelta expects to
> hash the whole source before starting to generate the delta. So where
> does windowing come into that? Do you take parallel sections of
> both source and target? Oh ... that would be O.K. for small differences,
> and for large unrelated files you'd just get "noisy" compression, right?
According to the VCDIFF draft:
    However, even with a memory-efficient and fast string matching
    algorithm, the computing resource requirements can be prohibitive
    for processing large files. The standard way to deal with this is
    to partition input files into "windows" to be processed
    separately. Here, except for some unpublished work by Vo, little
    has been done on designing effective windowing schemes. Current
    techniques, including Vdelta, simply use windows of the same size
    with corresponding addresses across source and target files.
    String matching and windowing algorithms have large influence on
    the compression rate of delta and compressed files.
So, you've got your work cut out for you. :)
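For what it's worth, here is a rough sketch of the fixed-size scheme as I read the draft: cut both files at the same offsets and hand each pair of windows to the delta encoder on its own, so only one source window has to be hashed at a time. The name paired_windows and the window_size parameter are made up for illustration; this is not code from vdelta or from the draft.

```python
from typing import Iterator, Tuple


def paired_windows(
    source: bytes, target: bytes, window_size: int
) -> Iterator[Tuple[int, bytes, bytes]]:
    """Yield (offset, source_window, target_window) at matching offsets.

    Illustrative only: same-size windows at corresponding addresses,
    which is how the draft describes current techniques.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    offset = 0
    # Walk the target; the source window is whatever lies at the same
    # address, and is empty once the target runs past the source's end.
    while offset < len(target):
        end = offset + window_size
        yield offset, source[offset:end], target[offset:end]
        offset = end


# Each pair would then be delta-encoded independently.
for offset, src_win, tgt_win in paired_windows(b"abcdefgh", b"abcXefghij", 4):
    print(offset, src_win, tgt_win)
```

With this scheme, a small in-place change stays inside one window pair. An insertion or deletion larger than the window shifts everything after it into a different window, so the encoder no longer sees the matching source data for the rest of the file.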
Received on Sat Oct 21 14:36:06 2006