Delta performance

From: Greg Hudson <ghudson_at_mit.edu>
Date: 2000-10-07 01:52:04 CEST

Well, I have a little bit of data on differencing performance using my
proposed output format. I compared the gcc 2.95 and gcc 2.95.2
sources against each other (just the .c files in the gcc subdir), and
found:

                                        Bytes
        File-by-file, ours:             15828
        File-by-file, diff+gzip:         9989
        Concatenated, ours:             15206
        Concatenated, diff+gzip:         6015
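
For anyone who wants to reproduce the diff+gzip baseline, here is a minimal sketch of how the two measurements could be taken. It is not the script used for the numbers above: it assumes Python's difflib unified diff as a stand-in for diff(1) and the gzip module as a stand-in for gzip(1), and the helper names (diff_gzip_size, file_by_file_size, concatenated_size) are hypothetical.

```python
import difflib
import gzip


def diff_gzip_size(old_text: str, new_text: str) -> int:
    """Return the size in bytes of a gzip-compressed unified diff.

    Assumption: difflib's unified diff approximates diff(1) output
    closely enough for a rough size comparison.
    """
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="old",
        tofile="new",
    )
    diff_bytes = "".join(diff_lines).encode("utf-8")
    return len(gzip.compress(diff_bytes))


def file_by_file_size(pairs) -> int:
    """Sum of per-file compressed diff sizes (one gzip stream per file)."""
    return sum(diff_gzip_size(old, new) for old, new in pairs)


def concatenated_size(pairs) -> int:
    """Compressed size of one diff between the concatenated old and new trees."""
    old_all = "".join(old for old, _ in pairs)
    new_all = "".join(new for _, new in pairs)
    return diff_gzip_size(old_all, new_all)


if __name__ == "__main__":
    # Hypothetical (old, new) contents for two files.
    pairs = [
        ("int a;\nint b;\n", "int a;\nint c;\n"),
        ("void f(void);\n", "void f(void);\nvoid g(void);\n"),
    ]
    print("file-by-file:", file_by_file_size(pairs))
    print("concatenated:", concatenated_size(pairs))
```

If each file's diff is compressed separately, as this sketch assumes, every file pays gzip's fixed header and trailer cost and the compressor cannot reuse matches across files. That would account for at least part of the gap between the file-by-file and concatenated diff+gzip figures.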

That wasn't too encouraging, so I decided to try some binary data. I
tried the .elc files which were present in both emacs 19.34b and emacs
20.7.

                                        Bytes
        File-by-file, ours:             2630361
        File-by-file, diff+gzip:        2246212
        Concatenated, ours:             4457134
        Concatenated, diff+gzip:        2029833
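
To put the two tables on a common scale, the short sketch below computes the ratio of our output size to the diff+gzip size for each row. The byte counts are copied from the tables above; the size_ratio helper and the dictionary layout are just illustrative.

```python
# Byte counts copied from the two tables: (ours, diff+gzip).
results = {
    ("gcc .c files", "file-by-file"): (15828, 9989),
    ("gcc .c files", "concatenated"): (15206, 6015),
    ("emacs .elc files", "file-by-file"): (2630361, 2246212),
    ("emacs .elc files", "concatenated"): (4457134, 2029833),
}


def size_ratio(ours: int, baseline: int) -> float:
    """Return how many times larger our output is than the baseline."""
    return ours / baseline


if __name__ == "__main__":
    for (dataset, mode), (ours, baseline) in results.items():
        ratio = size_ratio(ours, baseline)
        print(f"{dataset:17} {mode:13} ours/diff+gzip = {ratio:.2f}")
```

By this measure our output is between roughly 1.2 and 2.5 times the size of diff+gzip, and the gap is widest in the concatenated runs.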

That wasn't too encouraging either. I'd like to know whether the
problem is with our vdelta code, our window size, or my output format.
Unfortunately, Branko does not seem to have had time to debug his
generator (it dumps core when you use it, and after I fixed the first
bug the next bug looked difficult to fix), so I can't eliminate the
output format as a variable. I'm going to write to Phong Vo and ask
whether they ever released that library he mentioned the last time I
talked to him; maybe that will turn up a good source of comparative
data and ideas.

(And yes, I reenabled the call to vdelta before running these
tests. :) Otherwise it would take a lot more than 15828 bytes to
describe how to reconstruct the gcc sources.)
Received on Sat Oct 21 14:36:10 2006

This is an archived mail posted to the Subversion Dev mailing list.