On Wed, May 20, 2015 at 1:40 PM, Bert Huijben <bert_at_qqmail.nl> wrote:
>
>
> > -----Original Message-----
> > From: stefan2_at_apache.org [mailto:stefan2_at_apache.org]
> > Sent: woensdag 20 mei 2015 13:33
> > To: commits_at_subversion.apache.org
> > Subject: svn commit: r1680528 - /subversion/branches/fsx-
> > 1.10/subversion/libsvn_fs_x/transaction.c
> >
> > Author: stefan2
> > Date: Wed May 20 11:32:42 2015
> > New Revision: 1680528
> >
> > URL: http://svn.apache.org/r1680528
> > Log:
>
> >
> > @@ -3316,7 +3320,7 @@ commit_body(void *baton,
> > const char *revprop_filename, *final_revprop;
> > svn_fs_x__id_t root_id, new_root_id;
> > svn_revnum_t old_rev, new_rev;
> > - apr_file_t *proto_file;
> > + apr_file_t *proto_file, *revprop_file;
> > void *proto_file_lockcookie;
> > apr_off_t initial_offset, changed_path_offset;
> > svn_fs_x__txn_id_t txn_id = svn_fs_x__txn_get_id(cb->txn);
> > @@ -3437,6 +3441,12 @@ commit_body(void *baton,
> > /* Move the revprops file into place. */
> > SVN_ERR_ASSERT(! svn_fs_x__is_packed_revprop(cb->fs, new_rev));
> > SVN_ERR(write_final_revprop(&revprop_filename, cb->txn, txn_id,
> subpool));
> > +
> > + SVN_ERR(svn_io_file_open(&revprop_file, revprop_filename, APR_READ,
> > + APR_OS_DEFAULT, subpool));
>
This code will *not* work on Windows b/c flushing needs write
access to the file handle. A copy-n-paste bug on my part.
Fixed in r1682281.
> > + SVN_ERR(svn_io_file_flush_to_disk(revprop_file, subpool));
> > + SVN_ERR(svn_io_file_close(revprop_file, subpool));
>
> Perhaps this works on some platforms, but it really depends on which layer
> the information is cached before flushing.
>
All our fsync (I use that term generically for "flush-to-actual-disk")
operations are best-effort anyways. However, we know that we
don't have SVN-internal buffers there and that APR clean its
buffers and then calls the OS to flush its buffers.
We know that all the OS can do is to kindly ask the HW to maybe
consider actually writing data to non-volatile storage and that there
is no real guarantee that it will survive a power outage - or worse:
a hardware failure. But we do push data closer to the HW (less
latency until the actual write) and we add some sort of delay before
declaring to the world that e.g. a new revision has been born.
The combination of both should at least reduce the time window
in which we would see inconsistent data instead of just a failed
commit.
>
> [[
> (commit_body): Flush the revprop contents here.
> ]]
> Opening a file, flushing it and closing it isn't guaranteed to flush the
> information written when the file was opened earlier on.
>
Why not? According to MSDN, FileFlushBuffers flushes the
(cache) buffers for the specified file, not file handle. MSDN's
description of the file cache says that data is associated with
the kernel file *object* - not the handle. This is in line with
orthodox OS design.
Do you have information on the contrary? I spent about two
days last weekend doing research on how Windows is handling
things and anything reliable is hard to come by. So, my
assumptions may be wrong.
> I would be surprised if this actually performed a hardware flush on
> Windows, as generally things are attached to handles there... And if
> nothing is written to the handle, nor can be written, there is not much to
> flush to lower layers.
>
The only reasonable interpretation of what I have found is
that file handles under Windows are also just fancy and
sometimes heavily decorated pointers to shared kernel
objects. For example, the documentation to CreateFile
https://msdn.microsoft.com/en-us/library/windows/desktop/aa363858%28v=vs.85%29.aspx#caching_behavior
says
"A write-through request via *FILE_FLAG_WRITE_THROUGH* also causes NTFS to
flush any metadata changes, such as a time stamp update or a rename
operation, that result from processing the request. For this reason, the
*FILE_FLAG_WRITE_THROUGH* flag is often used with the
*FILE_FLAG_NO_BUFFERING* flag as a replacement for calling the
*FlushFileBuffers*
<https://msdn.microsoft.com/en-us/library/windows/desktop/aa364439%28v=vs.85%29.aspx>
function ..."
The "rename operation" makes only sense if it applies to
previous changes because MoveFile operates on paths and
IIRC there is no other way to rename a file.
> On top of that close, directly followed by open is generally slow on
> Windows, because virusscanners will access the file between these
> operations. Closing the file just once should perform much better and will
> avoid retry loops.
>
I agree that the current workflow is horribly inefficient but
I think we can change that. I'll talk about that in a separate
thread.
-- Stefan^2.
Received on 2015-05-28 19:06:07 CEST