On 3/24/07, Vincent Lefevre <vincent+svn@vinc17.org> wrote:
> On 2007-03-21 21:49:32 +0100, Erik Huelsmann wrote:
> > Encoding inode numbers in a working copy doesn't work exactly great,
> > since working copies are allowed to move from one system to another.
> > When doing so, this would invalidate the entire working copy inode
> > cache.
>
> You don't move working copies every day, do you? So, rebuilding the
> cache shouldn't be much of a hassle.
>
> BTW, it is not true that working copies are allowed to move from one
> system to another: if the locales change, filenames with non-ASCII
> characters will break.
>
> > Ben Reser tells me ctime updating can be turned off on many
> > filesystems and many admins seem to do so.
>
> Well, I don't know any such admin. But if one uses max(mtime,ctime),
> it will be equivalent to the current solution on such systems, so
> that this solution won't hurt.
>
> > Next to that, it doesn't exist on Windows,
>
> A Windows OS could be detected if need be (I assume such kind of
> things are done for other features, e.g. symbolic links).
>
> > nor does it exist in some (many?) Mac filesystems: UFS and HFS don't
> > know the concept.
>
> Perhaps the inode solution could work. Now, the use of ctime (or inode)
> could be optional, so that every user would be happy.
>
> > In order to create a ctime value in Windows, the APR developers took
> > the 'Creation time' to fill the slot. This means the value won't
> > change - ever - after file creation.
>
> But max(mtime,ctime) would work like mtime in such a case.
>
> > So, to be short, the system you are proposing works on some OSes, but
> > many (probably the majority) of our users won't benefit from a change
> > like that.
>
> They won't benefit, but if they can't see any change while other users
> will see an improvement, this would be better.
>
> > Not to be mean, but did you actually ever run into this problem
> > yourself, or are we just arguing for the sake of argument? Because I'm
> > still not denying it *may* happen, but chances are too slim to meet
> > the problem more than once in your lifetime, unless you have a real
> > use-case.
>
> Not that slim, in fact.
>
> prunille% echo "17 \\u20ac" > euro
> prunille% stat euro
> File: `euro'
> Size: 7 Blocks: 8 IO Block: 4096 regular file
> Device: e000002h/234881026d Inode: 19496211 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 501/ vinc17) Gid: ( 501/ vinc17)
> Access: 2007-03-22 00:36:10.000000000 +0100
> Modify: 2007-03-24 01:36:02.000000000 +0100
> Change: 2007-03-24 01:36:02.000000000 +0100
> prunille% cat euro
> 17 €
> prunille% recode UTF-8..ISO-8859-1 euro
> prunille% stat euro
> File: `euro'
> Size: 7 Blocks: 8 IO Block: 4096 regular file
> Device: e000002h/234881026d Inode: 19559326 Links: 1
> Access: (0644/-rw-r--r--) Uid: ( 501/ vinc17) Gid: ( 501/ vinc17)
> Access: 2007-03-24 01:36:25.000000000 +0100
> Modify: 2007-03-24 01:36:02.000000000 +0100
> Change: 2007-03-24 01:36:32.000000000 +0100
> prunille% cat euro
> 17 EUR
>
> And also the potential problems with mv. It is too easy to create
> files that have the same timestamp and the same length. An example
> in my working copy:
>
> -rw-r--r-- 1 vlefevre vlefevre 1030 2006-10-03 23:14:27 b00034.xml
> -rw-r--r-- 1 vlefevre vlefevre 1030 2006-10-03 23:14:27 b00218.xml
>
> and I have at least 3 other such couples. Note that when replacing
> a file by another one, it generally means that such files have some
> relation, and an identical timestamp and/or size may be a cause of
> such a relation.
>
> Moreover, remember that if the problem occurs, it can be quite
> serious, as it can lead to data loss.

Right, but in your recode example no data is lost, since the file
was already under version control. The only thing lost is the
trivial encoding transformation. Also, the fact that you can show me
a situation where this actually is a problem merely shows that it
isn't purely theoretical. I'm certain it's not a common use case for
Subversion: most users will do other things than check out their
files, recode them, commit them and start all over again with
recoding. Nor does being able to name more than one scenario where
this is a problem mean that the chances of running into it are very
big.

The same goes for your move example: the only way file1 and file2
are likely to have the same timestamp is when they were checked out
or updated at the same time. This means the file can trivially be
copied from the repository again and the copy operation can easily
be repeated. With the current status in trunk, these files would
also have to be exactly the same size.
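
For reference, the kind of check being argued about can be sketched
roughly as follows. This is only an illustration, not Subversion's
actual code: the names Recorded, stamp_of, record and looks_unmodified
are made up here, and the use_ctime flag stands in for the optional
max(mtime,ctime) variant proposed earlier in the thread.

```python
import os
from collections import namedtuple

# Hypothetical record of what a working copy remembered about a file.
Recorded = namedtuple("Recorded", ["size", "stamp"])


def stamp_of(st, use_ctime=False):
    """Timestamp to compare: mtime alone, or max(mtime, ctime)."""
    if use_ctime:
        return max(st.st_mtime_ns, st.st_ctime_ns)
    return st.st_mtime_ns


def record(path, use_ctime=False):
    """Remember the size and timestamp of a file as it is now."""
    st = os.stat(path)
    return Recorded(st.st_size, stamp_of(st, use_ctime))


def looks_unmodified(path, recorded, use_ctime=False):
    """Heuristic only: True when size and timestamp both still match."""
    st = os.stat(path)
    return (st.st_size == recorded.size
            and stamp_of(st, use_ctime) == recorded.stamp)
```

With use_ctime left off, a rewrite that keeps the size and restores
mtime (as recode did above) passes as unmodified. With it on, the
change should be noticed on systems that update ctime, while on
systems where the ctime slot never moves past mtime the result is the
same as the mtime-only check.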

The fact that Windows can be detected isn't my point. My point is that
if a certain solution doesn't work on Windows, it doesn't work on half
the Subversion installs, maybe more. If the problem is as bad as you
describe, it can't be acceptable to introduce a solution which doesn't
work for that many users, now can it?

> And after finding rare bugs (including one that existed for at least
> 15 years in a standard Unix utility), I won't be surprised if such a
> problem occurs sooner or later here.

I have no idea what you mean by the above paragraph. If we decide
not to extend our heuristic, then the resulting situation is by
design, not a bug. I'm also not denying it can happen. What I'm saying
is that the situation will be rare enough not to cause significant
problems.

Bye,

Erik
Received on Sat Mar 24 17:20:04 2007