On Tue, 2005-10-18 at 12:00 -0400, Cyrus Jones wrote:
> Thanks for the response on my earlier question. The reason I asked is
> that I ran a test with a large compressed file. I checked this file in
> on my development machine and then checked out the file on a second
> machine. I then made several random changes throughout the file and
> checked that in. When I updated the file on the second machine, it
> appears that SVN simply pulled the entire file down instead of just the
> changes. Is there some kind of threshold where, if a file is too
> "noisy", SVN simply resorts to pulling down a new copy?
Nope. But depending on the compression algorithm, stream alignment and
other issues may prevent the delta from being much smaller than the
entire file. zlib has this problem. There is actually a patch floating
around (it's in the rsync contrib dir) that makes zlib rsync-friendly,
which in turn makes it delta-friendly in general.
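To see the effect concretely, here's a quick sketch using Python's
standard zlib module (made-up data, purely for illustration -- it's not
anything Subversion itself does):

    import zlib

    # Build some mildly compressible input, then flip one byte near the
    # start and compare the two compressed streams byte-for-byte.
    data = b"".join(b"line %06d: some mildly compressible text\n" % i
                    for i in range(20000))
    modified = bytearray(data)
    modified[50] ^= 0xFF                 # a single one-byte change

    a = zlib.compress(data, 6)
    b = zlib.compress(bytes(modified), 6)

    # Length of the common prefix of the two compressed streams.
    prefix = 0
    while prefix < min(len(a), len(b)) and a[prefix] == b[prefix]:
        prefix += 1
    print(len(a), len(b), prefix)        # shared prefix is only a few bytes

Because deflate's bit-packing and Huffman tables depend on everything
seen so far, virtually the whole compressed stream changes after the
modified byte; the rsyncable patch periodically resets the compressor
state so that a change stays local to one region of the output.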
To give a more technical answer: if you remove a 33-byte chunk and
everything after it shifts 33 bytes to the left to fill the gap, the
data becomes misaligned with the fixed points the delta algorithm
checksums when looking for matches. It won't recognize that the bytes
are the same again until it hits the x % 33 == 0'th byte, where x is a
multiple of the offsets it checksums at.
If this happens quite often, it's possible the delta algorithm never
finds matches.
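If you want to see that misalignment in isolation, here's a toy Python
sketch with a made-up 64-byte block size (it's not the real xdelta
code, just the general idea of checksumming at fixed points):

    import hashlib, os

    BLOCK = 64                                    # made-up block size

    def block_sums(buf):
        # checksum the data in fixed blocks at offsets 0, BLOCK, 2*BLOCK, ...
        return [hashlib.md5(buf[i:i + BLOCK]).digest()
                for i in range(0, len(buf), BLOCK)]

    original = os.urandom(128 * 1024)
    modified = original[:1000] + original[1033:]  # delete a 33-byte chunk

    known = set(block_sums(original))
    hits = sum(s in known for s in block_sums(modified))
    print(hits, "of", len(block_sums(modified)), "blocks still match")

Only the blocks before the deleted chunk still match; everything after
it is shifted by 33 bytes and no longer lines up with the checksummed
block boundaries, which is exactly the misalignment described above.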
A good way to estimate how well Subversion's algorithm will do is to use
the xdelta command-line application to produce a delta between the two
files.
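For example, something along these lines (hypothetical file names; this
assumes the xdelta 1.x command-line syntax, "xdelta delta <old> <new>
<patch>", which may differ for other xdelta versions):

    import os, subprocess

    old, new, patch = "file-r1.bin", "file-r2.bin", "file.delta"

    # Produce a delta between the two versions (exit-status conventions
    # vary between xdelta versions, so the return code isn't checked).
    subprocess.call(["xdelta", "delta", old, new, patch])

    print("new file: %d bytes, delta: %d bytes"
          % (os.path.getsize(new), os.path.getsize(patch)))

If the delta comes out nearly as large as the new file, you can expect
the update on the second machine to transfer roughly the whole file.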
There are also various stream-alignment techniques that try to discover
better starting points than just taking checksums every x bytes, but we
don't use them (mainly because nobody has implemented them).
HTH,
Dan