Re: RFC: Changing conditions for 'M'odified status

From: Erik Huelsmann <ehuels_at_gmail.com>
Date: 2006-11-09 23:04:25 CET

On 11/9/06, SebastianUnger@eaton.com <SebastianUnger@eaton.com> wrote:
>
>
> > -----Original Message-----
> > From: Erik Huelsmann [mailto:ehuels@gmail.com]
> > Sent: Friday, 10 November 2006 10:39
> > To: SVN Dev
> > Subject: RFC: Changing conditions for 'M'odified status
> >
> >
> > In the presence of svn:keywords or svn:eol-style, there's currently no
> > way to detect changedness of a file but to do a detranslation and
> > compare contents of the text file with its text base.
> >
> > 'svn status' does a full detranslate+compare for files it has reason
> > to believe might have changed: it checks for changed mtimes. As
> > working copies age, more and more files may have an mtime which no
> > longer corresponds to the one recorded in the svn admin area. Thus,
> > in order to find modified files, all of those files will be fully
> > detranslated+compared.
> I was under the impression that when svn status does a full comparison
> and finds a file to be equal to the base, it resets the admin area mtime
> to the current mtime of the file. If that is (or would be) true, the age
> of the working copy would have much less impact on the performance of
> status. So if this is not done at the moment, I'd definitely propose
> that it be done (update the mtime), either as an alternative or in
> addition to the text-size approach.

This 'time repairing' only happens when you run 'svn cleanup', which,
given the nature of the command, most people don't use on non-broken
working copies. Maybe it happens with other write-requiring operations
too, but definitely not with 'svn status', which is a read-only
operation (and no, we can't make it upgrade its lock to a write lock:
two 'svn status' commands may stumble over each other)...

> > While a changed mtime can only indicate a file might have changed,
> > files which differ in size can't contain equal contents (on the
> > bit-level). So, we can sometimes shortcut the full detranslate+compare
> > cycle by just looking at the size of the file and comparing that to
> > the size when we created it. Files which have different sizes must
> > have changed.
> Agreed.
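
A minimal sketch of the check order proposed above, in Python rather
than Subversion's C. Everything in it is an assumption for illustration:
is_modified() and detranslate() are hypothetical names, the recorded
mtime and size stand in for whatever the admin area would store, and
detranslation is reduced to CRLF->LF normalization (keywords are
ignored).

```python
import os


def detranslate(data: bytes) -> bytes:
    """Hypothetical stand-in for detranslation: CRLF -> LF only."""
    return data.replace(b"\r\n", b"\n")


def is_modified(path, recorded_mtime_ns, recorded_size, text_base):
    """Decide whether a working file counts as modified.

    recorded_mtime_ns and recorded_size are assumed to be the values
    noted when the working file was last written.
    """
    st = os.stat(path)

    # Step 1: unchanged mtime, assume the file is unmodified.
    if st.st_mtime_ns == recorded_mtime_ns:
        return False

    # Step 2: the proposed shortcut. A different size means the bytes
    # differ, so report 'modified' without reading the file.
    if st.st_size != recorded_size:
        return True

    # Step 3: same size but different mtime, so fall back to the
    # full detranslate+compare against the text base.
    with open(path, "rb") as f:
        return detranslate(f.read()) != text_base
```

Only step 2 is new; steps 1 and 3 mirror the behaviour described
earlier in the thread. Step 2 is also where a false positive can come
from, since it never looks at the detranslated content.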

> > But there's one catch: if a file has been modified by changing a
> > keyword expansion or CRLF->LF (or vice versa), the content *has*
> > changed since it was written, but when detranslated+compared it's
> > still equal to the text base.
> >
> > Now, I think the catch above is an extreme edge case unlikely to
> > happen. Also, the gain we get from not resorting to full
> > detranslate+compare outweighs the (very slim) chance for false
> > positives on 'svn status'.
> I am not entirely sure that it is an edge case when you (as I regularly
> do) start copying files between working copies of potentially different
> branches etc. Also we have XML files which are updated automatically
> using XSLT translations that use the text-base (stored in .svn) as their
> input.
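
The catch from the quoted proposal can be made concrete with a small
Python sketch. It assumes detranslation is nothing more than CRLF->LF
normalization; normalize_eol() and the sample byte strings are made up
for illustration and are not Subversion code.

```python
def normalize_eol(data: bytes) -> bytes:
    """Hypothetical stand-in for detranslation: CRLF -> LF only."""
    return data.replace(b"\r\n", b"\n")


# Repository normal form, as kept in the text base.
text_base = b"line one\nline two\n"

# Working file as originally written with CRLF line endings.
as_written = b"line one\r\nline two\r\n"

# The same file after some tool rewrote CRLF -> LF.
after_tool = b"line one\nline two\n"

# The size shortcut sees a difference and would report 'M'...
size_differs = len(as_written) != len(after_tool)

# ...while a full detranslate+compare finds nothing changed.
content_equal = normalize_eol(after_tool) == text_base
```

Both size_differs and content_equal come out True here, which is
exactly the false positive being weighed in this thread.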

Heh, well, I do understand your point, and you'd do it another way if
there were no .svn directory, but you really shouldn't be accessing
the content of the .svn directory at all. Just a warning...

> Those translations quite regularly de-translate the $Id$'s in
> those files, given that they aren't expanded in the text-base. However,
> most of the time they don't actually result in any other changes
> to the files.
>
> So I conclude that the assumption above depends strongly on the ways
> svn is used in a particular setup, and if the algorithm above is added
> I would at least like to see an option to disable that optimisation.

Hmm. And you wonder how files come to be without mtime changes yet do
have content changes? :-)

> > Peter Lundblad has his reservations. What do others think?
> >
> > * Does my proposal violate the semantics of 'svn status'?
> > * Is having an algorithm with only false negatives (not detecting
> > modified files) better than having an algorithm which also has false
> > positives (marking files as modified without them actually being
> > modified w.r.t. the 'repository normal form') and fewer false
> > negatives?
> Making it optional gives users the choice and would defuse that problem
> in my eyes.

Adding options is a programmer's way of avoiding decisions. I'd rather
not resort to that :-)

bye,

Erik.

Received on Thu Nov 9 23:05:13 2006

This is an archived mail posted to the Subversion Dev mailing list.