On Tue, 2005-04-26 at 17:51 -0500, kfogel@collab.net wrote:
> Marcus Rueckert <darix@web.de> writes:
> > According to Ben (sussman) the current change detection does the
> > following steps:
> >
> > 1. load entries file into memory
> > 2. stats the file
> > 3. if the timestamps match -> returns NOT_CHANGED
> > 4. if the timestamps differ, it stats the text base.
> > 5. if the size of text base and file differ -> returns CHANGED
> > 6. if the sizes match it does a byte-by-byte comparison.
> >
> > I think step 6 can be optimized a bit.
> > The entries file has the md5sum of the text-base stored.
> > Why don't we just read the working file and md5sum the content?
> > This way we only need to read 1 file into memory (the working file) and
> > the md5sum algorithm might be faster than the diff algorithm.
> >
> > any comments?
>
> To calculate an MD5 sum, you must read every byte in the file.
Actually, you could compute the md5sum in a rolling fashion and stop when
the two don't match.
:)
But this is no faster than comparing bytes (in part because MD5 is not
magic).
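
For anyone who wants to play with the trade-off, here is a rough sketch in
Python of the six steps quoted above, with both variants of step 6. It is
not Subversion's code: the names (is_modified, files_differ_bytewise,
md5_of_file), the recorded_mtime and recorded_md5 parameters, and the 64 KB
chunk size are all made up for illustration. The byte comparison reads two
files but can stop at the first differing chunk; the comparison against a
stored checksum reads only the working file but has to read all of it,
since the digest is only known at the end.

```python
import hashlib
import os

CHUNK = 64 * 1024  # arbitrary read size, chosen only for this sketch


def files_differ_bytewise(path_a, path_b):
    """Compare two files chunk by chunk, stopping at the first mismatch."""
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(CHUNK)
            chunk_b = b.read(CHUNK)
            if chunk_a != chunk_b:
                return True   # early exit: no need to read the rest
            if not chunk_a:
                return False  # both files hit EOF together


def md5_of_file(path):
    """Return the hex MD5 of a file; every byte has to be read."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_modified(working, text_base, recorded_mtime, recorded_md5=None):
    """Hypothetical version of steps 2 to 6 from the quoted list.

    recorded_mtime and recorded_md5 stand in for what the entries file
    would hold (step 1, loading that file, is left out here).
    """
    st = os.stat(working)                          # step 2
    if st.st_mtime == recorded_mtime:              # step 3
        return False
    if st.st_size != os.stat(text_base).st_size:   # steps 4 and 5
        return True
    if recorded_md5 is not None:                   # step 6, proposed variant
        return md5_of_file(working) != recorded_md5
    return files_differ_bytewise(working, text_base)  # step 6, byte compare
```

Timing is_modified with and without recorded_md5 on a large file that
differs near the start, and on one that differs only in the last byte,
would show where each variant does more work.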
Received on Wed Apr 27 16:48:59 2005