
Re: Binary diffs: real-world differencing?

From: Mark Phippard <markp_at_softlanding.com>
Date: 2006-02-03 16:30:53 CET

Daniel Griscom <griscom@suitable.com> wrote on 02/03/2006 10:24:06 AM:

> I'm looking into Subversion, and since I do a lot of multimedia work
> I'm especially interested in its ability to do binary diffs. But I'm
> wondering: has anyone documented how well this reduces
> traffic/storage with different types of real-world files? For
> instance:
>
> - Change a small region of a JPEG file, and a much larger region may
> actually change
>
> - Change a few lines of code, and a compiled executable's internal
> pointers and code locations may completely change
>
> - Change a few pixels of a GIF file with LZW compression, and
> different pixel strings may be represented by different keys,
> changing the whole file
>
> - Changing a small portion of an MSWord document may (for all I know)
> change the whole file
>
> - Adding or removing a single file from a ZIP archive may (again)
> change the compression keys, thus substantially changing the data.
>
> This leaves me wondering whether every time I change a binary file
> SVN will spend a long time carefully comparing the old and new
> versions, only to throw up its hands and copy the entire new file
> into the repository.
>
>
> So, has anyone documented how well different types of binary files
> are differenced? If not, is there a way I can easily test it myself,
> perhaps with a command-line executable that takes two files and
> outputs a "binary difference" file?

Create an FSFS repository and import one file of each of these types into
it. Then check them out, and modify and commit them one at a time. After
each commit, look at the size of the new revision file in the repository
(under db/revs). That will give you a good idea of how well each type is
differenced. You could also trace the network connection to see how much
data is sent. If you test with the TortoiseSVN client against an http://
repository, it will report how much data is transferred.
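
The repository test could be scripted roughly as follows. This is only a sketch in Python, assuming the `svn` and `svnadmin` command-line tools are on the PATH. The function names (`run`, `revision_sizes`, `measure`) and the file name `sample.bin` are made up for illustration, and packed revisions are not handled.

```python
import os
import shutil
import subprocess
import tempfile
from pathlib import Path


def run(*args):
    """Run a command quietly and raise if it exits with an error."""
    subprocess.run(args, check=True, stdout=subprocess.DEVNULL)


def revision_sizes(repo):
    """Map revision number -> size in bytes of its FSFS revision file.

    Searches <repo>/db/revs, including shard subdirectories if present.
    Files whose names are not plain numbers are ignored.
    """
    sizes = {}
    for path in Path(repo, "db", "revs").rglob("*"):
        if path.is_file() and path.name.isdigit():
            sizes[int(path.name)] = path.stat().st_size
    return sizes


def measure(old_file, new_file):
    """Commit old_file, then new_file on top of it, in a scratch repository.

    Returns (size of new_file, size of the revision file for the change).
    """
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp, "repo")
        wc = Path(tmp, "wc")
        run("svnadmin", "create", "--fs-type", "fsfs", str(repo))
        run("svn", "checkout", repo.as_uri(), str(wc))

        # First commit: the original version of the file.
        target = wc / "sample.bin"
        shutil.copyfile(old_file, target)
        run("svn", "add", str(target))
        run("svn", "commit", "-m", "original", str(wc))

        # Second commit: the modified version, stored against the first.
        shutil.copyfile(new_file, target)
        run("svn", "commit", "-m", "modified", str(wc))

        sizes = revision_sizes(repo)
        return os.path.getsize(new_file), sizes[max(sizes)]
```

Calling `measure("photo_v1.jpg", "photo_v2.jpg")` (hypothetical file names) returns the size of the modified file and the size of the revision file that recorded the change. The revision file also carries some bookkeeping besides the file data, so treat the comparison as approximate: if the two numbers are close, differencing saved little for that file type.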

Mark

Received on Fri Feb 3 16:33:21 2006

This is an archived mail posted to the Subversion Users mailing list.
