On Sun, 2005-10-30 at 12:45 -0600, Jonathan Gilbert wrote:
> At 10:48 AM 30/10/2005 -0500, Greg Hudson wrote:
> >At any rate, standard practice for logfiles is to open them in append
> >mode and write out each log message in a single write(), which is
> >guaranteed to be atomic. If we do that, then multiple handles to the
> >same file due to symlinks should be a resource-consumption issue only,
> >not a correctness issue.
>
> I don't see how this can guarantee what we need. While whatever a write()
> sends out is guaranteed to be atomic, the write() function is capable of
> doing partial writes, and there's no way to be certain that it won't do
> that.

I believe that for regular files on a local filesystem, write() does all
or nothing. I don't have an authoritative source, since I didn't manage
to pull up either the SUS or POSIX standards on the web easily.
(Logging to a network filesystem is always going to be a little dodgy.
Not much we can do about that, I think.)

> The only way I can see to be *absolutely sure* is to use an OS-level
> synchronization function. Since cross-process synchronization is likely
> more expensive than in-process synchronization, perhaps this should be
> selected at startup based on the user's choice of connection mode (threads
> vs. fork).
[...]
> Another minor issue is that the Linux man page for write() (section 2)
> indicates that there exist filesystems where write() doesn't even guarantee
> that space on the device has been reserved for the data, let alone that
> the data has been written. If the two file handles don't know about each
> other, then we also need to fsync() after every write(), and this creates a
> race condition in the absence of synchronization.

If a filesystem gives up on returning an error from write() on disk
full, then pretty much every application is sunk when it comes to
recovering gracefully from a full disk. Subversion won't be unusual in that
regard.

If the filesystem has not actually written out the data, that doesn't
mean the kernel isn't ensuring proper append semantics. POSIX requires
that when write() returns, the data is reflected by a subsequent read()
in any process, and local filesystems generally conform to that
constraint.

At any rate, locking or fsyncing for each log message would be a
performance killer, so even if there are edge cases for simply opening
for append and writing, I think we should do it anyway.
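
For illustration, the approach argued for above could look roughly like
the following Python sketch. It is not Subversion's actual code: the
helper names open_log, log_message and demo, and the file name
example.log, are made up for this example. It opens the same file twice
in append mode (as might happen when two paths reach one file through
symlinks), writes each message with a single os.write() call, treats a
short write as an error, and reads the file back without any fsync or
locking.

```python
import os
import tempfile


def open_log(path):
    # O_APPEND asks the kernel to position every write at the current
    # end of file, so independent handles do not overwrite each other.
    return os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)


def log_message(fd, message):
    # Build the whole record first, then hand it over in one write call.
    data = (message + "\n").encode("utf-8")
    written = os.write(fd, data)
    if written != len(data):
        # A short write would leave a torn record; report it rather
        # than retrying, since a retry could interleave with others.
        raise OSError("short write to log: %d of %d bytes"
                      % (written, len(data)))


def demo():
    # Hypothetical scenario: two unrelated handles on one log file.
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "example.log")
    fd_a = open_log(path)
    fd_b = open_log(path)
    try:
        for i in range(3):
            log_message(fd_a, "handle A message %d" % i)
            log_message(fd_b, "handle B message %d" % i)
        # Read back through a third handle, with no fsync in between.
        with open(path, "rb") as f:
            lines = f.read().decode("utf-8").splitlines()
    finally:
        os.close(fd_a)
        os.close(fd_b)
        os.remove(path)
        os.rmdir(tmpdir)
    return lines


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The sketch runs sequentially, so it only shows the mechanics. It says
nothing about behavior under real concurrency or on network filesystems.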
Received on Mon Oct 31 07:51:19 2005