
Re: svn commit: r1483795 - /subversion/branches/1.8.x/STATUS

From: Branko Čibej <brane_at_wandisco.com>
Date: Sat, 18 May 2013 10:51:40 +0200

On 18.05.2013 08:55, Bert Huijben wrote:
> I see all those easy +1’s on other operating systems...
> I assume you all reproduced the problem on Windows???

What the blazes are you going on about, Bert? Justin's +1 was for
putting repository integrity before performance. That has nothing at all
to do with any specific platform.

> Maybe also on actual hardware instead of a VM (with a VM harddisk
> emulation infrastructure with different powerfail handling)?
>
> I still see no proof that the symptoms are not in this category!

Ivan's analysis of the failure mode is correct, and it has nothing to do
with disks, emulated or otherwise, because it is a side effect of what
happens before the data hits the disk driver layer.

> But if you check the sqlite research: which other operating systems
> provide the same guarantees on power failure at the cost of a lot of
> performance?

The SQLite docs clearly spell out what happens if you don't turn on the
whole sync'd WAL. The Subversion repository is not an application-level
persistent store, it's a centralized repository that we /know/ has to be
as robust as we can make it.
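For reference, here is a minimal sketch of what "the whole sync'd WAL" looks like from an application, using Python's standard sqlite3 module. The two PRAGMAs are real SQLite settings; the helper name open_durable and the demo file name are made up for illustration, and this says nothing about how Subversion itself configures SQLite.

```python
import os
import sqlite3
import tempfile


def open_durable(path):
    """Open a database with write-ahead logging and full syncing (hypothetical helper)."""
    conn = sqlite3.connect(path)
    # Switch to write-ahead logging; SQLite reports the mode actually in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    # FULL asks SQLite to sync the WAL on every commit, trading speed for durability.
    conn.execute("PRAGMA synchronous=FULL")
    return conn, mode


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    conn, mode = open_durable(os.path.join(demo_dir, "demo.db"))
    print(mode, conn.execute("PRAGMA synchronous").fetchone()[0])
    conn.close()
```

With a lower synchronous setting the commits get cheaper, and the SQLite docs describe what you give up in exchange.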

> We are talking about flushing the NTFS journal to ensure everything
> for a single file is flushed. Something which in multi user systems
> such as *nix really requires root permissions as it allows trashing
> performance for the entire system.

Excuse me? fsync() requires root? That's a new one.

> Safety is a nice property, but you can’t get it via just these
> flushes. Usability is also important. And the rest of the system needs
> the same power safety security principles for these flushes to make
> sense. *And then only in critical places, not after every small
> tempfile write.*
>
> *E.g. part of the design. Not as part of a low level function.*

Are you saying that it's OK to have a known repository corruption for
the sake of not having users breathe down our necks about performance on
Windows?

> If we are going this way we can stop all the fsfs v2 development and
> optimizations for our biggest market. If we go this way we are going
> to be several orders of magnitude slower anyway for fixing a few of
> our power loss issues. There is no use of shaving a few % in other places.
>
>
> Not every filesystem has the performance characteristics of ext2; a
> system without journal.
> This is moving back to the simplistic “Windows is slow” world we had
> around 1.5 before I joined the development.

So why, in your opinion, is it OK to flush the file buffers on every
platform /except/ Windows? Are you saying that sysadmins on other
systems don't use journalled filesystems?

(Interestingly enough, that /is/ the case when we're talking about
storage for databases such as Oracle or Postgres, which have their own
journalling and logging built in. But a Subversion repository doesn't
even come close to that level of robustness on its own.)

> Looking at the number of corruptions reported over the past 4 years.
> How many users would be happier if the repository and/or working copy
> would be something like 400% slower to make it ‘somewhat less likely
> to corrupt on power failure’?

You're talking through your hat. You do not have a single data point
that supports your 400%, and moreover you haven't any data about the
relative importance of Windows as a Subversion /server/. But that's all
beside the point.

Subversion's promise from day one is that it will not lose data (barring
bugs). Ivan fixed a bug that can cause data loss. It's as simple as
that. Moreover, the bug was, I suspect, caused by a misguided attempt at
improving Windows performance without considering the consequences.

You know as well as anyone else that we could make FSFS /much/ faster if
we didn't worry about atomically installing revision files. But
performance has always been secondary to robustness, in both the server
side /and/ the working copy.
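To make "atomically installing" concrete, here is a rough sketch of the usual pattern: write to a temporary file, flush it, then rename it into place. It is written in Python rather than against APR, the function name install_atomically is hypothetical, and it is not the FSFS code; it only illustrates where the flushes sit and why they cost time.

```python
import os
import tempfile


def install_atomically(final_path, data):
    """Write data to a temp file, flush it to stable storage, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # Create the temp file in the target directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # drain the userspace buffer into the OS
            os.fsync(tmp.fileno())  # ask the OS to push the file contents to disk
        os.replace(tmp_path, final_path)  # readers see the old file or the new one
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Assumption: on POSIX the directory is synced too, so the rename itself is durable.
    if os.name == "posix":
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Dropping the fsync calls makes this much faster, and also makes it possible for the rename to reach the disk before the contents do. Note that nothing here needs elevated privileges.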

Instead of screaming at people about performance loss, a more
constructive approach would be to propose an alternative solution that
fixes the same bug differently, but just as completely.

-- Brane

-- 
Branko Čibej
Director of Subversion | WANdisco | www.wandisco.com
Received on 2013-05-18 10:52:25 CEST

This is an archived mail posted to the Subversion Dev mailing list.