On Sat, May 18, 2013 at 10:55 AM, Bert Huijben <bert_at_qqmail.nl> wrote:
> I see all those easy +1’s on other operating systems...
> I assume you all reproduced the problem on Windows???
>
> Maybe also on actual hardware instead of a VM (with a VM harddisk emulation
> infrastructure with different powerfail handling)?
>
> I still see no proof that the symptoms are not in this category!
>
The original customer report was about a failure on real hardware, not a VM.
>
> But if you check the sqlite research: which other operating systems provide
> the same guarantees on power failure at the cost of a lot of performance?
>
> We are talking about flushing the NTFS journal to ensure everything for a
> single file is flushed. Something which in multi user systems such as *nix
> really requires root permissions as it allows trashing performance for the
> entire system.
>
> Safety is a nice property, but you can’t get it via just these flushes.
> Usability is also important. And the rest of the system needs the same power
> safety security principles for these flushes to make sense. And then only in
> critical places, not after every small tempfile write.
>
> E.g. part of the design. Not as part of a low level function.
I agree with you that flushing should be part of the design, not of a
low-level function. But that is how the current code works, and you
changed the behavior of a low-level function in r1082451. Even more:
you made the behavior of that low-level function platform-dependent.
That's why I reverted this change. The proper fix would be to introduce
svn_io_write_atomic() and svn_io_atomic_stream_create() with a full
sync flag, and also to add an explicit DISK_FLUSH flag to
svn_io_write_unique() instead of making the behavior depend on the
DELETE_WHEN flag.
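
To illustrate the idea of an atomic write with an explicit flush flag,
here is a rough sketch in Python rather than C, for brevity. The names
write_atomic and flush_to_disk are made up for this example and are not
existing Subversion API; the real function would need to follow APR
and svn_io conventions.

```python
import os
import tempfile


def write_atomic(path, data, flush_to_disk=True):
    """Replace `path` with `data` so readers see either the old or new content.

    Hypothetical helper for illustration only, not a Subversion API.
    The caller decides explicitly whether to pay for a disk flush.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Create the temp file next to the target so the final rename stays
    # within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()  # push Python's buffer to the OS
            if flush_to_disk:
                # Ask the OS to push the file data to the device
                # (fsync on POSIX, a file commit on Windows).
                os.fsync(tmp.fileno())
        # Atomically move the finished file over the target.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind on failure.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise

    if flush_to_disk and os.name == "posix":
        # On POSIX the rename itself is only durable once the containing
        # directory has been synced as well.
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

The point is that the flush is requested by the caller at the places
where the design needs it, e.g. write_atomic(path, data) for critical
metadata and write_atomic(path, data, flush_to_disk=False) for small
temp files, instead of being a hidden side effect of some unrelated
flag.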
> If we are going this way we can stop all the fsfs v2 development and
> optimizations for our biggest market. If we go this way we are going to be
> several orders of magnitude slower anyway for fixing a few of our power loss
> issues. There is no use of shaving a few % in other places.
>
I don't think that FSFS performance is the important thing. Developing
our own high-performance database with access from multiple processes
is not what Subversion developers should be focused on. Moves and
merges are much more important.
(And I'm not even talking about our own cache server :)
> Not every filesystem has the performance characteristics of ext2; a system
> without journal.
> This is moving back to the simplistic “Windows is slow” world we had around
> 1.5 before I joined the development.
>
You know that the 1.5 slowness had different causes: multiple .svn
directories and our own entries file.
> Looking at the number of corruptions reported over the past 4 years. How
> many users would be happier if the repository and/or working copy would be
> something like 400% slower to make it ‘somewhat less likely to corrupt on
> power failure’?
>
I agree that working copy corruptions are not as important, but they
are still important. Are you ready to lose several complex changes in
source code for a 400% performance win? You can always open the hard
disk's Properties, go to Policies, and check "Turn off Windows
write-cache buffer flushing on the device" to get the performance, if
you're ready to trade it for working copy corruptions.
--
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com
Received on 2013-05-20 09:35:50 CEST