
Re: Efficient and effective fsync during commit

From: Ivan Zhakov <ivan_at_visualsvn.com>
Date: Fri, 29 May 2015 17:14:40 +0300

On 28 May 2015 at 20:47, Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> wrote:
> Hi all,
>
> Most of us would agree that the way we fsync FS changes
> in FSFS and FSX is slow (~10 commits / sec on a SSD,
> YMMV) and not even all changes are fully fsync'ed
> (repo creation, upgrade).
>
The first question: is it really a problem? Commits are usually not
that frequent. There are maintenance tasks like 'svnadmin load' that
perform commits very often, but that could be fixed with a
'--fsfs-no-sync' option to 'svnadmin load', like we had for BDB.

> From a high-level perspective, a commit is a simple
> 3-step process:
>
> 1. Write rev contents & props to their final location.
> They are not accessible until 'current' gets bumped.
> Write the new 'current' contents to a temp file.
> 2. Fsync everything we wrote in step 1 to disk.
> Still not visible to any other FS API user.
> 3. Atomically switch to the new 'current' contents and
> fsync that change.
>
> Today, we fsync "locally" as part of whatever copy or
> move operation we need. That is inefficient because
> on POSIX platforms, we tend to fsync the same folder
> more than once, and on Windows we would need to sync
> the files twice (content before and metadata after the
> rename). Basically, missing the context info, we need
> to play it safe and do much more work than we would
> actually have to.
>
> In the future, we should implement step 1 as simple
> non-fsync'ing file operations. Then explicitly sync every
> file, and on POSIX the folders, once. Step 2 does not
> have any atomicity requirements. Finally, do the 'current'
> rename. This also only requires a single fsync b/c
> the temp file will be in the same folder.
>
> On top of that, all operations in step 2 can be run
> concurrently. I did that for FSX on Linux using aio_fsync
> and it got 3x as fast. Windows can do something similar.
> I wrapped that functionality into a "batch_fsync" object
> with a few methods on it. You simply push paths into it,
> it drops duplicates, and finally you ask it to fsync all.
>
I didn't find any documentation stating that calling FlushFileBuffers() on
one handle flushes changes (data and metadata) made through another handle.
I'm -1 on relying on this without official documentation confirming it, at
least for FSFS.
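
For reference, the POSIX side of the batching idea quoted above could be
sketched roughly as follows. This is only an illustration, not the code from
the FSX work: the names batch_fsync_t, batch_push, batch_push_parent,
batch_push_file, batch_flush and publish_current are all hypothetical, the
fixed BATCH_MAX limit is an arbitrary choice, and the flush runs sequentially
with plain fsync() rather than concurrently with aio_fsync(). It also assumes
that fsync() on a descriptor opened O_RDONLY works for files and directories,
which holds on Linux but is not guaranteed on every platform.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BATCH_MAX 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical batch object: a small set of unique paths to flush. */
typedef struct {
    char *paths[BATCH_MAX];
    int count;
} batch_fsync_t;

/* Queue PATH unless it is already queued.  Returns 0 on success,
   -1 if the batch is full or memory runs out. */
int batch_push(batch_fsync_t *b, const char *path)
{
    int i;
    assert(b != NULL && path != NULL);
    for (i = 0; i < b->count; i++)
        if (strcmp(b->paths[i], path) == 0)
            return 0;                      /* duplicate: nothing to do */
    if (b->count == BATCH_MAX)
        return -1;
    b->paths[b->count] = strdup(path);
    if (b->paths[b->count] == NULL)
        return -1;
    b->count++;
    return 0;
}

/* Queue the directory containing FILE. */
int batch_push_parent(batch_fsync_t *b, const char *file)
{
    char *copy = strdup(file);             /* dirname() may modify its input */
    int rc;
    if (copy == NULL)
        return -1;
    rc = batch_push(b, dirname(copy));     /* batch_push copies the result */
    free(copy);
    return rc;
}

/* Queue FILE and its parent directory (step 1 bookkeeping). */
int batch_push_file(batch_fsync_t *b, const char *file)
{
    if (batch_push(b, file) != 0)
        return -1;
    return batch_push_parent(b, file);
}

/* Step 2: fsync every queued path exactly once, then empty the batch.
   Returns 0 if everything was flushed, -1 if any open/fsync failed. */
int batch_flush(batch_fsync_t *b)
{
    int i, rc = 0;
    for (i = 0; i < b->count; i++) {
        int fd = open(b->paths[i], O_RDONLY);
        if (fd < 0 || fsync(fd) != 0)
            rc = -1;
        if (fd >= 0)
            close(fd);
        free(b->paths[i]);
        b->paths[i] = NULL;
    }
    b->count = 0;
    return rc;
}

/* Step 3: atomically rename TMP over FINAL, then sync the containing
   directory once so the rename itself is durable. */
int publish_current(const char *tmp, const char *final)
{
    batch_fsync_t b = { {0}, 0 };
    if (rename(tmp, final) != 0)
        return -1;
    if (batch_push_parent(&b, final) != 0)
        return -1;
    return batch_flush(&b);
}
```

A commit would write its files without syncing, call batch_push_file() for
each of them, call batch_flush() once, and only then call publish_current().
Nothing here addresses the FlushFileBuffers() question on Windows.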

-- 
Ivan Zhakov
Received on 2015-05-29 16:15:56 CEST

This is an archived mail posted to the Subversion Dev mailing list.