
Re: bug in svn diff and related?

From: Travis <svn_at_castle.fastmail.fm>
Date: 2005-03-16 05:15:22 CET

On Mar 15, 2005, at 9:46 PM, Ben Collins-Sussman wrote:

> On Mar 15, 2005, at 3:23 PM, Travis P wrote:
>
>> On Mar 15, 2005, at 8:31 AM, Ben Collins-Sussman wrote:
>>
>>> The algorithm is extremely similar to what CVS does:
>>>
>>>   stat working and textbase files.
>>>   if (mtimes of working and textbase are equal):
>>>     return NOT_CHANGED;
>>>   else if (filesizes of working and textbase are unequal):
>>>     return CHANGED;
>>>   else
>>>     compare the files byte-by-byte. /* very slow */
...
>>   if (filesizes of working and textbase are unequal):
>>     return CHANGED;
>>   else if (mtimes of working and textbase are equal):
>>     return NOT_CHANGED;
>>   else
>>     compare the files byte-by-byte. /* very slow */
>>
>
> Why is this less likely to return a false answer?
>
> And also, I'd argue this is slower. 99% of the time, almost every
> file in the tree will be unchanged, and will have identical
> working/textbase timestamps. Using the current algorithm, it means
> that 99% of the time we get a definitive "answer" to the question on
> the first comparison. In your algorithm, we'd end up doing two
> comparisons nearly all the time, instead of one.

Checking the meta-data as a proxy for the content is a heuristic.
People's processes mess with timestamps for various reasons. With the
comparisons reversed as I suggested, a changed file is wrongly reported
as NOT_CHANGED only if two pieces of this meta-information (the file
size and the mtime) both happen to be preserved, rather than just the
mtime. That is less likely to happen as an unintended consequence.
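
To make the difference concrete, here is a minimal Python sketch of the
two orderings. It is only an illustration of the pseudocode quoted
above, not Subversion's actual code; the function names and the demo
file names are made up for this example. The demo edits a file and then
resets its mtime to match the textbase, which is the kind of timestamp
meddling I have in mind.

```python
import filecmp
import os
import tempfile


def changed_mtime_first(working, textbase):
    """Current ordering: trust equal mtimes before looking at sizes."""
    w, t = os.stat(working), os.stat(textbase)
    if w.st_mtime_ns == t.st_mtime_ns:
        return False  # NOT_CHANGED
    if w.st_size != t.st_size:
        return True  # CHANGED
    # Fall back to the slow byte-by-byte comparison.
    return not filecmp.cmp(working, textbase, shallow=False)


def changed_size_first(working, textbase):
    """Proposed ordering: check sizes first, then mtimes."""
    w, t = os.stat(working), os.stat(textbase)
    if w.st_size != t.st_size:
        return True  # CHANGED
    if w.st_mtime_ns == t.st_mtime_ns:
        return False  # NOT_CHANGED
    # Fall back to the slow byte-by-byte comparison.
    return not filecmp.cmp(working, textbase, shallow=False)


def demo():
    """Edit the working file, then force its mtime to match the textbase."""
    with tempfile.TemporaryDirectory() as d:
        textbase = os.path.join(d, "textbase")  # hypothetical file names
        working = os.path.join(d, "working")
        with open(textbase, "w") as f:
            f.write("hello\n")
        with open(working, "w") as f:
            f.write("hello, world\n")  # different content and size
        st = os.stat(textbase)
        # Simulate a process that resets the timestamp after the edit.
        os.utime(working, ns=(st.st_atime_ns, st.st_mtime_ns))
        return (changed_mtime_first(working, textbase),
                changed_size_first(working, textbase))
```

On a filesystem that stores the copied timestamp exactly, demo() should
report the edit as unchanged under the mtime-first ordering and as
changed under the size-first ordering. An edit that keeps both the size
and the mtime still fools either ordering.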

True: I am considering the cost of the extra integer comparison and
branch to be negligible. If you are decreasing the accuracy of the
heuristic to avoid that cost, it seems like a micro-optimization that
is not worth it to me. Do you really think the cost of the comparison
and branch is not negligible, even across tens or maybe hundreds of
thousands of files in a wc?
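
As a rough way to put a number on that extra work, the following Python
sketch counts metadata comparisons under each ordering. It is a toy
model, not a measurement: the function names are invented here, and it
assumes the stat results are already in hand, so the only difference
between the orderings is the number of integer comparisons.

```python
def metadata_comparisons(same_mtime, same_size, size_first):
    """Count the metadata comparisons one file needs before an answer
    is reached (or before falling through to the byte-by-byte compare)."""
    if size_first:
        # The size check always runs; the mtime check runs only if sizes match.
        return 2 if same_size else 1
    # The mtime check always runs; the size check runs only if mtimes differ.
    return 1 if same_mtime else 2


def total_metadata_comparisons(n_files, size_first):
    """Total for a tree in which every file is unchanged, i.e. every
    working file has the same mtime and size as its textbase."""
    return sum(metadata_comparisons(True, True, size_first)
               for _ in range(n_files))
```

For a tree of 100,000 unchanged files, the model gives 100,000
comparisons with the mtime-first ordering and 200,000 with the
size-first ordering: one extra integer comparison per file, on data the
stat calls have already fetched.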

Received on Wed Mar 16 05:18:07 2005

This is an archived mail posted to the Subversion Users mailing list.
