[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: FSFS Repository corruption on high load on Windows (was Re: svn commit: r1082451 - /subversion/trunk/subversion/libsvn_subr/io.c)

From: Ivan Zhakov <ivan_at_visualsvn.com>
Date: Fri, 17 May 2013 18:02:21 +0400

On Tue, May 14, 2013 at 9:06 PM, Ivan Zhakov <ivan_at_visualsvn.com> wrote:
> On Thu, Mar 17, 2011 at 4:21 PM, <rhuijben_at_apache.org> wrote:
>> Author: rhuijben
>> Date: Thu Mar 17 12:21:29 2011
>> New Revision: 1082451
>>
>> URL: http://svn.apache.org/viewvc?rev=1082451&view=rev
>> Log:
>> On Windows avoid a file flush that is taking over 15% of the checkout time.
>>
>> * subversion/libsvn_subr/io.c
>> (svn_io_write_unique): Avoid a very costly file flush on Windows.
>>
>
> [...]
>
>> --- subversion/trunk/subversion/libsvn_subr/io.c (original)
>> +++ subversion/trunk/subversion/libsvn_subr/io.c Thu Mar 17 12:21:29 2011
>> @@ -3199,8 +3199,17 @@ svn_io_write_unique(const char **tmp_pat
>>
>> err = svn_io_file_write_full(new_file, buf, nbytes, NULL, pool);
>>
>> - if (!err)
>> + /* ### BH: Windows doesn't have the race condition between the write and the
>> + ### rename that other operating systems might have. So allow windows
>> + ### to decide when it wants to perform the disk synchronization using
>> + ### the normal file locking and journaling filesystem rules.
>> +
>> + ### Note that this function doesn't handle the rename, so we aren't even
>> + ### sure that we really have to sync. */
>> +#ifndef WIN32
>> + if (!err && nbytes > 0)
>> err = svn_io_file_flush_to_disk(new_file, pool);
>> +#endif
>>
> Hi Bert,
>
> How did you come to conclusion that Windows doesn't have race
> condition between write and rename?
>
> It seems Windows actually have race condition between write and move
> in case of unexpected computer shutdown. In this case move operation
> completes after restart, but file contains zero data. I understand
> that you intention was optimize client side operation but
> svn_io_write_unique() function actively used in FSFS and lead
> repository data corruption.
>
> I've created small program to demonstrate this problem (see
> attachment). On start it create four threads actively writing data to
> disk and one thread that uses write + move to update file atomically.
> If you run this program on virtual machine and then reset then with
> very good chance you'll get 'current' file with zeros.
The problem happens because file write and move operations are *not*
serialized. They are performed independently. In normal conditions
it's ok to write after move to new file name, but we got file with
zero's in case of unexpected failure.

I've reverted r1082451 in r1483781 and nominated to backport to 1.7.x
and 1.8.x. I've looked through other usages of svn_io_file_rename()
in FSFS code and find other cases when such corruption could happen.

-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com
Received on 2013-05-17 16:03:17 CEST

This is an archived mail posted to the Subversion Dev mailing list.