
RE: RFC: Changing conditions for 'M'odified status

From: <SebastianUnger_at_eaton.com>
Date: 2006-11-09 22:54:21 CET

> -----Original Message-----
> From: Erik Huelsmann [mailto:ehuels@gmail.com]
> Sent: Friday, 10 November 2006 10:39
> To: SVN Dev
> Subject: RFC: Changing conditions for 'M'odified status
> In the presence of svn:keywords or svn:eol-style, there's currently no
> way to detect changedness of a file but to do a detranslation and
> compare contents of the text file with its text base.
> 'svn status' does a full detranslate+compare for files which it has
> reason to believe they might have changed: it checks for changed
> mtimes. When working copies age, more and more files might have an
> mtime which doesn't correspond with the one in the svn admin area
> anymore. Thus in order to find modified files all those files will be
> fully detranslated+compared.
I was under the impression that when svn status does a full comparison
and finds a file to be equal to its base, it resets the admin area mtime
to the current mtime of the file. If that is (or were) the case, the age
of the working copy would have much less impact on the performance of
status. So if this is not done at the moment, I'd definitely propose
updating the recorded mtime, either instead of or in addition to the
text-size approach.
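
To make the mtime suggestion concrete, here is a rough Python sketch, not Subversion's actual code. Entry, detranslate() and is_modified() are names I made up for illustration, and the detranslation is a toy that only normalises CRLF to LF. The point is just that once a full compare finds the file equal to its base, the recorded mtime is refreshed, so the next status run takes the cheap path again.

```python
import os
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for the admin-area record of one file."""
    recorded_mtime_ns: int
    text_base: bytes  # contents in repository normal form


def detranslate(data: bytes) -> bytes:
    """Toy detranslation: only CRLF -> LF, keywords are ignored here."""
    return data.replace(b"\r\n", b"\n")


def is_modified(path: str, entry: Entry, stats: dict) -> bool:
    """Return True if the file differs from its text base."""
    mtime = os.stat(path).st_mtime_ns
    if mtime == entry.recorded_mtime_ns:
        return False  # cheap path: mtime matches, assume unchanged

    # Expensive path: read, detranslate and compare with the text base.
    stats["full_compares"] = stats.get("full_compares", 0) + 1
    with open(path, "rb") as f:
        same = detranslate(f.read()) == entry.text_base

    if same:
        # Remember the new mtime so later runs skip the full compare.
        entry.recorded_mtime_ns = mtime
    return not same
```

With this, a file that was merely touched costs one full compare on the first status run and none on later runs, however old the working copy gets.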

> While a changed mtime can only indicate a file might have changed,
> files which differ in size can't contain equal contents (on the
> bit-level). So, we can sometimes shortcut the full detranslate+compare
> cycle by just looking at the size of the file and comparing that to
> the size when we created it. Files which have different sizes must
> have changed.
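
As I read it, the proposed shortcut could be sketched roughly as below (Python, illustrative only). Verdict and quick_check() are hypothetical names, and the order of the checks (mtime first, then size) is my assumption, not something stated in the proposal.

```python
from enum import Enum


class Verdict(Enum):
    UNCHANGED = "unchanged"
    MODIFIED = "modified"
    NEEDS_FULL_COMPARE = "needs full compare"


def quick_check(recorded_size: int, recorded_mtime: int,
                current_size: int, current_mtime: int) -> Verdict:
    """Classify a working file using only cheap stat() information."""
    if current_mtime == recorded_mtime:
        # Same mtime as recorded: treated as unchanged, as today.
        return Verdict.UNCHANGED
    if current_size != recorded_size:
        # Different size: the bytes on disk cannot be what was written.
        return Verdict.MODIFIED
    # Same size but different mtime: only a full compare can tell.
    return Verdict.NEEDS_FULL_COMPARE
```

Only the last case still pays for a full detranslate+compare.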

> But there's one catch: if a file has been modified by changing a
> keyword expansion or CRLF->LF (or vice versa), the content *has*
> changed since it was written, but when detranslated+compared it's
> still equal to the text base.
> Now, I think the catch above is an extreme edge case unlikely to
> happen. Also, the gain we get from not resorting to full
> detranslate+compare outweighs the (very slim) chance for false
> positives on 'svn status'.
I am not entirely sure that it is an edge case when you (as I regularly
do) copy files between working copies of potentially different
branches etc. Also, we have XML files which are updated automatically
by XSLT transformations that use the text-base (stored in .svn) as their
input. Those transformations quite regularly de-translate the $Id$
keywords in those files, given that they aren't expanded in the
text-base. Most of the time, however, they don't result in any other
changes to the files.

So I conclude that the assumption above depends strongly on the way
svn is used in a particular setup, and if the algorithm above is added
I would at least like to see an option to disable that optimisation.
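
To illustrate the case I mean, here is a small Python sketch. Everything in it is hypothetical: the use_size_shortcut flag is the kind of option I am asking for, not an existing svn setting, and detranslate() is a toy that only contracts $Id$ and normalises CRLF.

```python
import re

# Matches an expanded keyword such as "$Id: doc.xml 42 ... $".
EXPANDED_ID = re.compile(rb"\$Id:[^$\n]*\$")


def detranslate(data: bytes) -> bytes:
    """Toy detranslation: contract $Id$ and convert CRLF to LF."""
    data = EXPANDED_ID.sub(b"$Id$", data)
    return data.replace(b"\r\n", b"\n")


def is_modified(working: bytes, recorded_size: int, text_base: bytes,
                use_size_shortcut: bool = True) -> bool:
    """Decide whether the working contents count as modified."""
    if use_size_shortcut and len(working) != recorded_size:
        return True  # size differs: reported as modified, no compare
    return detranslate(working) != text_base


text_base = b"<!-- $Id$ -->\n<doc/>\n"
expanded = b"<!-- $Id: doc.xml 42 2006-11-09 sunger $ -->\n<doc/>\n"
recorded_size = len(expanded)  # size of the file as svn wrote it

# Our XSLT step regenerates the file from the text-base, so the
# keyword ends up contracted and the size no longer matches.
working = text_base

with_shortcut = is_modified(working, recorded_size, text_base)
without_shortcut = is_modified(working, recorded_size, text_base,
                               use_size_shortcut=False)
```

Here with_shortcut is True (a false positive) while without_shortcut is False, which is why I would want to be able to switch the optimisation off.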

> Peter Lundblad has his reservations. What do others think?
> * Does my proposal violate the semantics of 'svn status'?
> * Is having an algorithm with only false negatives (not detecting
> modified files) better than having an algorithm which also has false
> positives (marking files as modified without them actually being
> modified w.r.t. the 'repository normal form') and fewer false
> negatives?
Making it optional gives users the choice and would defuse that problem
in my eyes.

Just my two cents...
Received on Thu Nov 9 22:54:46 2006

This is an archived mail posted to the Subversion Dev mailing list.