Re: Efficient and effective fsync during commit

From: Philip Martin <philip.martin_at_wandisco.com>
Date: Fri, 29 May 2015 11:20:36 +0100

Stefan Fuhrmann <stefan.fuhrmann_at_wandisco.com> writes:

> On Thu, May 28, 2015 at 9:54 PM, Philip Martin
> <philip.martin_at_wandisco.com> wrote:
>>
>> fsync() works on file descriptors rather than files, do we need to keep
>> the original file descriptors open in order to fsync()?
>
> We could b/c there are at most 7 (4 files, 3 folders) of them for a
> FSFS commit, but this is not necessary. Since it would imply
> keeping them open during renames, we could no longer use
> plain APR calls - i.e. extra code churn.
>
> If your interpretation was correct, fsync'ing a directory would
> only work if you modified that directory file through its descriptor -
> which you simply can't. Also, it would mean that our protorev
> file handling was broken: We open & close that file for every
> PUT, re-open it during commit, append the structure data and
> fsync only through the last file handle.

It's not my interpretation as such; I just want us to be clear about
the assumptions we would be making.

I suppose it is possible that our protorev handling is broken on some
filesystems. It is also possible that some filesystems handle
directories and files in totally different ways: some sort of COW tree
for directories and a list of blocks for files. Using the behaviour of
fsync on directories is not necessarily a good way to predict the
behaviour of fsync on files. There is no mention of directories in the
POSIX description of fsync, unlike that of open.
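
To make that assumption concrete, here is a minimal sketch of the
pattern being discussed, written in Python rather than against APR. It
is not the FSFS code; the function names and the file layout are
hypothetical. Data is appended through short-lived descriptors that are
closed without fsync, and only the final descriptor is fsynced. Whether
that last fsync also makes the earlier writes durable is precisely the
filesystem behaviour we would be relying on.

```python
import os


def write_all(fd, data):
    """Write all of data to fd, looping over short writes."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def append_then_fsync_last_handle(path, chunks, trailer):
    """Append chunks via separate descriptors; fsync only the last one.

    Hypothetical illustration of the open/close-per-write pattern.
    """
    for chunk in chunks:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        try:
            write_all(fd, chunk)   # deliberately no fsync on this descriptor
        finally:
            os.close(fd)

    # Re-open, append the trailing data, and fsync through this handle only.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    try:
        write_all(fd, trailer)
        os.fsync(fd)
    finally:
        os.close(fd)
```

The sketch only shows the sequence of calls; it cannot demonstrate
durability, which would need a crash or power-loss test on the
filesystem in question.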

If we consider a directory fsync after a rename then there is more to do
than just identifying which disk blocks store the directory; the rename
may have affected two directories. In that case, if fsync on one
directory is to flush the other as well, the filesystem must either do
a complete metadata flush or store some sort of pointer to the other
directory. I don't think any of this is a problem for our
current Subversion code but it does illustrate that directory fsync is
not necessarily a model for file fsync.
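
As an illustration of the two-directory case, a conservative caller
could avoid depending on either filesystem strategy by fsyncing both
parent directories itself. The following is a minimal sketch in Python,
assuming a POSIX system where a directory can be opened read-only and
passed to fsync (true on Linux, not guaranteed everywhere). The helper
names are hypothetical and this is not what Subversion does today.

```python
import os


def fsync_dir(path):
    """Open a directory read-only and fsync it (POSIX-only assumption)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def rename_and_fsync_parents(src, dst):
    """Rename src to dst, then fsync every parent directory involved.

    Returns the sorted list of directories that were fsynced.
    """
    src_dir = os.path.dirname(os.path.abspath(src))
    dst_dir = os.path.dirname(os.path.abspath(dst))

    os.rename(src, dst)

    # A set collapses the common case where both names share one directory.
    parents = sorted({src_dir, dst_dir})
    for parent in parents:
        fsync_dir(parent)
    return parents
```

Doing both fsyncs explicitly costs one extra call in the cross-directory
case but makes no assumption about whether fsync on one directory
flushes the other.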

-- 
Philip Martin | Subversion Committer
WANdisco // *Non-Stop Data*

Received on 2015-05-29 12:21:35 CEST
