On Wed, Mar 03, 2010 at 01:54:28PM +0100, Bert Huijben wrote:
> > If two threads write to the same pristine, the content written
> > will be the same (except in case of a SHA1 checksum collision
> > which we choose to ignore). So, thread 1 writes to a tempfile, and
> > when it's done, it moves the tempfile into place, the new filename
> > of the tempfile being based on the SHA1 sum of the written content.
> > If thread 2 does the same concurrently, the end result will be the
> > same -- the file will only exist once at its SHA1 sum name.
> On posix, when using svn_io_rename_file() this would be true and this would
> be pretty safe.
> On Windows you get an access denied (bad) and a 15 second delay retrying the
> move (worse).
If renaming the tempfile to the SHA1 name fails because the SHA1-named
file exists, we can rest assured that the content of the file which already
exists matches exactly what we want the file to contain (assuming SHA1
collisions are negligible).
So if we get access denied on Windows because the file already exists on
disk, we can return, knowing some other process did our job for us.
We don't need to retry. We just need to stat and see if the file exists.
Only if it does not exist do we need to raise an error (because this means
there's a permissions problem or something like that).
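
As a rough sketch of the logic described above (Python for brevity, not
Subversion's actual code; install_pristine and its parameters are
hypothetical names made up for illustration):

```python
import hashlib
import os
import tempfile


def install_pristine(pristine_dir, content):
    """Write content to a tempfile, then move it to its SHA1-based name.

    Hypothetical helper: illustrates the no-retry idea only.
    """
    sha1 = hashlib.sha1(content).hexdigest()
    target = os.path.join(pristine_dir, sha1)

    # Write to a tempfile in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=pristine_dir)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(content)

    try:
        os.rename(tmp_path, target)
    except OSError:
        # Do not retry. Just check whether the target is there.
        already_there = os.path.exists(target)
        os.unlink(tmp_path)
        if already_there:
            # Someone else installed the same content (SHA1 collisions
            # are ignored), so our job is done.
            return target
        # The target is missing, so this is a real error
        # (permissions or something like that).
        raise
    return target
```

On POSIX the rename silently replaces an existing file, so the except
branch is mostly the Windows case. Either way, two writers racing on the
same content end up with exactly one file at the SHA1 name.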
> So we should try to avoid overwriting existing files here. (I would guess
> that tools like rsync and incremental backups also like that we don't change
> the date of these files)
There is no need to even consider how this would affect other tools.
We're talking about a race condition, where two processes try to write the
same pristine at the same time because they both found it to be missing.
This won't happen often, and if it happens, the existing file being renamed
over will only have existed for a very short time.
Received on 2010-03-03 15:01:41 CET