
Re: RFC: Changing conditions for 'M'odified status

From: John Szakmeister <john_at_szakmeister.net>
Date: 2006-11-10 09:03:36 CET

----- Erik Huelsmann <ehuels@gmail.com> wrote:
> In the presence of svn:keywords or svn:eol-style, there's currently no
> way to detect changedness of a file but to do a detranslation and
> compare contents of the text file with its text base.
>
> 'svn status' does a full detranslate+compare for files which it has
> reason to believe they might have changed: it checks for changed
> mtimes. When working copies age, more and more files might have an
> mtime which doesn't correspond with the one in the svn admin area
> anymore. Thus in order to find modified files all those files will be
> fully detranslated+compared.
>
> While a changed mtime can only indicate a file might have changed,
> files which differ in size can't contain equal contents (on the
> bit-level). So, we can sometimes shortcut the full detranslate+compare
> cycle by just looking at the size of the file and comparing that to
> the size when we created it. Files which have different sizes must
> have changed.

Makes sense. Actually, I was surprised to see that we don't already make use of that information, especially since the size can be retrieved in the same stat() call. :-(
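
As a minimal sketch of the shortcut described above (Python for brevity; `RecordedEntry` and `quick_status` are hypothetical names, not Subversion APIs, and the assumption is that the admin area records the working file's size alongside its mtime):

```python
import os
from dataclasses import dataclass


@dataclass
class RecordedEntry:
    """Hypothetical record of the working file as it was last written."""
    mtime_ns: int
    size: int


def quick_status(path: str, entry: RecordedEntry) -> str:
    """Classify a file using a single stat() call.

    Returns "unmodified", "modified", or "compare" (meaning the caller
    still has to do the full detranslate+compare against the text base).
    """
    st = os.stat(path)  # one call yields both the mtime and the size

    if st.st_mtime_ns == entry.mtime_ns:
        # Same mtime: assume the file is untouched, as status does today.
        return "unmodified"

    if st.st_size != entry.size:
        # Different size: the bytes on disk cannot match what was written.
        return "modified"

    # mtime changed but the size did not: only a full compare can tell.
    return "compare"
```

Only the last branch pays for reading the file, so an aged working copy with many touched-but-resized files would skip most of the expensive comparisons.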

>
> Especially in working copies with very large files (or many changed
> files), we could get much faster status results by employing this
> algorithm.
>
> But there's one catch: if a file has been modified by changing a
> keyword expansion or CRLF->LF (or vice versa), the content *has*
> changed since it was written, but when detranslated+compared it's
> still equal to the text base.
>
> Now, I think the catch above is an extreme edge case unlikely to
> happen. Also, the gain we get from not resorting to full
> detranslate+compare outweighs the (very slim) chance for false
> positives on 'svn status'.
>
> Peter Lundblad has his reservations. What do others think?
>
> * Does my proposal violate the semantics of 'svn status'?
> * Is having an algorithm with only false negatives (not detecting
> modified files) better than having an algorithm which also has false
> positives (marking files as modified without them actually being
> modified w.r.t. the 'repository normal form') and fewer false
> negatives?

Personally, I had strong reservations when I first thought about this issue. I expect that 'svn status' is going to tell me which files are going to be committed. It affects my log message, and it'd be frustrating to have to remove an entry because status lied to me. OTOH, I expect 'svn status' to tell me about all of the files that are going to be committed (i.e., the fact that it has false negatives is troublesome). I guess the answer really lies in how this affects our typical use-case, which--I believe--is the case where people don't muck with the mtime. In that case, things will work as they always have. Given that, and the fact that you have to work to set the old mtime (or not track it on the file system), and that you'd have to muck with the keyword expansion in order to trigger a false positive, I'd be okay with the change.
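
To make the false-positive case concrete, here is a small illustration (again Python, with a hypothetical `normalize_eol` standing in for detranslation; it handles only line endings, whereas real detranslation would also contract keywords):

```python
def normalize_eol(data: bytes) -> bytes:
    """Stand-in for detranslation: convert CRLF line endings to LF."""
    return data.replace(b"\r\n", b"\n")


text_base = b"line one\nline two\n"      # repository normal form (LF)
written = b"line one\r\nline two\r\n"    # working file as written, with CRLF
edited = b"line one\nline two\n"         # later re-saved with LF endings

# The size-only check sees a difference and would report the file as modified...
size_differs = len(edited) != len(written)

# ...but after detranslation the content still equals the text base.
content_equal = normalize_eol(edited) == text_base
```

Here both `size_differs` and `content_equal` end up true: the size check would flag the file, while a full detranslate+compare would find nothing to commit.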

I think someone raised a valid point about 3rd parties depending on the output of status, and expecting that anything in status will be committable. That would be the only reservation I have with this change. Unfortunately, we have no idea how this will affect 3rd parties. FWIW, any of the tools I've written would continue to work.

-John

Received on Fri Nov 10 09:03:54 2006

This is an archived mail posted to the Subversion Dev mailing list.