
Re: File modification detection?

From: Mike Mason <mgm_at_thoughtworks.net>
Date: 2004-04-22 21:59:43 CEST

Ben Collins-Sussman wrote:

>On Thu, 2004-04-22 at 14:28, Mike Mason wrote:
>>If Subversion is storing MD5s for the text bases anyhow, shouldn't the
>>comparison use the MD5 instead of being byte-for-byte?
>Hm, but that would force us to always read the *entire* text-base file.
>At the moment, we bail out as soon as we see a difference between the
>working and text-base files.
>My first thought was: the current algorithm is faster than
>checksumming, because we can bail early. Most changes aren't at the
>very end of a file.
>But then Karl pointed out that while "on average" our current algorithm
>bails after reading half the text-base, this is cancelled out by the
>fact that it's reading two files instead of one. So maybe the
>byte-for-byte and checksum strategies come out even? :-)

Well, I figure most of the time my working files are going to be cached
by the operating system[1] because I've been working on them. CPUs are
pretty fast, so it's disk I/O we're worried about here. If the working
file is already in cache, the text-base read is the only one that hits
the disk, and checksumming avoids it entirely -- does that make a
difference? As someone already pointed out, I guess this isn't really
that important unless you're storing big files that change content but
not size (like, er, maybe a disk image).
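
To make the two strategies concrete, here's a rough sketch in Python
(not Subversion's actual C code; the function names, the chunk size and
the idea of passing the stored MD5 as a hex string are all my own
assumptions for illustration). The first function reads both files and
bails at the first difference; the second reads only the working file
and compares against a checksum recorded earlier.

```python
import hashlib

CHUNK = 64 * 1024  # arbitrary read size, chosen only for this sketch


def differs_bytewise(working_path, text_base_path):
    """Read both files in step and stop at the first differing chunk."""
    with open(working_path, "rb") as work, open(text_base_path, "rb") as base:
        while True:
            work_chunk = work.read(CHUNK)
            base_chunk = base.read(CHUNK)
            if work_chunk != base_chunk:
                return True   # early exit: the rest of both files is never read
            if not work_chunk:
                return False  # both files ended together with no difference


def differs_by_md5(working_path, stored_md5_hex):
    """Read the whole working file once; the text-base is never opened."""
    digest = hashlib.md5()
    with open(working_path, "rb") as work:
        for chunk in iter(lambda: work.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest() != stored_md5_hex
```

The trade-off is visible in the shape of the code: the byte-for-byte
version can return early but touches two files, while the MD5 version
always reads to the end but touches only one.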


[1] For useful values of "operating system" -- Windows XP seems to like
keeping 300 megs of RAM free on my 1 gig machine and then endlessly
grinding the disk for me.

Received on Thu Apr 22 22:00:06 2004

This is an archived mail posted to the Subversion Dev mailing list.
