On Thu, Apr 28, 2011 at 5:10 AM, Nico Kadel-Garcia <nkadel_at_gmail.com> wrote:
<More Liberal Snipping for attempted brevity...>
> According to the paper, you *are*. You're mirroring the backend
> Subversion databases on the multiple servers, keeping them
> synchronized by accepting only authorized transactions on a designated
> "master" and relaying them to the other available servers as
> necessary. That's actually master/slave behind the scenes: the slaves
> effectively passthrough the database submissions. This is built into
> every major multiple location database or service for the last.... I
> dunno, 30 years? It's certainly fundamental to dynamic DNS and NTP.
Mirroring a filesystem is a very different replication technique from the one we use.
WANdisco replays every write command that the client sends to their local
node against the other Subversion front-ends, just as the client sent it.
This happens after obtaining an agreement, but before the transaction
reaches Subversion. We're not mirroring the backend of Subversion although
the effect is the same in that each Subversion repository remains identical
after the identical changes are applied.
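To make the distinction concrete, here's a toy Python sketch of command-replay replication (entirely my own illustration, not our actual code; the node names and WebDAV-style command strings are made up). The point is that each front-end applies the same client commands in the same order, rather than having its backend files mirrored:

```python
# Toy sketch of command-replay replication (illustration only).
# Each client write is applied, verbatim, to every front-end after agreement,
# so the repositories converge because identical changes are applied in the
# same order -- no backend filesystem mirroring is involved.

class FrontEnd:
    def __init__(self, name):
        self.name = name
        self.history = []          # ordered commands this node has applied

    def apply(self, command):
        self.history.append(command)

def replicate(command, front_ends):
    """Replay one client command against every front-end, as the client sent it."""
    for node in front_ends:
        node.apply(command)

nodes = [FrontEnd(n) for n in ("us", "eu", "apac")]
for cmd in ["MKACTIVITY a1", "PUT /trunk/foo.c", "MERGE a1"]:
    replicate(cmd, nodes)

# All replicas end up identical because the identical changes were applied.
assert nodes[0].history == nodes[1].history == nodes[2].history
```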
In a majority quorum there is NO master server - I'm not sure how you're
reading the whitepaper, but behind the scenes or not there isn't any master
being defined (although it is an option if people choose it - See Singleton
Quorum). There is the concept of a 'distinguished' node which can act as a
tie breaker in the event of a 50/50 split in the agreement, but the
situations where that's invoked are pretty rare unless you only have two
nodes to start with.
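Sketching the voting rule may help show why no master is involved. This is my own simplification, not product internals: an ordinary strict majority decides, and the 'distinguished' node only matters when the vote splits exactly 50/50.

```python
# Illustrative majority-quorum check (a simplification, not WANdisco internals).
# With an odd node count a strict majority decides; a 'distinguished' node
# only acts as a tie-breaker when the votes split evenly.

def quorum_reached(votes, distinguished=None):
    """votes: dict of node -> True/False. Returns whether the proposal passes."""
    yes = sum(votes.values())
    no = len(votes) - yes
    if yes != no:
        return yes > no               # ordinary strict majority; no master
    # 50/50 split: the distinguished node's vote breaks the tie
    return distinguished is not None and bool(votes.get(distinguished, False))

print(quorum_reached({"a": True, "b": True, "c": False}))            # True
print(quorum_reached({"a": True, "b": False}, distinguished="a"))    # True
print(quorum_reached({"a": True, "b": False}, distinguished="b"))    # False
```

Note that with three or more nodes the tie-breaker path is rarely exercised, which matches what I said above about it being rare unless you start with two nodes.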
> You've renamed the categories of service, but that's clearly the
> underlying technology.
As I say, I beg to differ here. I'm happy to keep discussing, but it might
be quicker if you let me just show you on a real live platform. I'm open for
that if you have time?
> Right. Now make it 3 sets of 3, with each set distributed in different
> locations. *In each location*, the set of 3 can vote amongst
> themselves and go haring off in divergence from the other 6, or even
> two other sets of 3. Unless you prescribe that each distributed set must
> vote among all *9* servers, and get a majority,
Yes, we prescribe 9 nodes here. There is no option for local agreement. If
you have 9 nodes *in the same replication group* then the majority quorum is
between all of them. You need 5 nodes to agree on a transaction for a global
sequence number to be generated and the transaction allowed. You can run
multiple replication groups for more complex requirements, but I think that's
way outside the scope of this discussion.
> you're in danger of
> local sets diverging. And when some idiot in a data center says "huh,
> we're disconnected from the main codeline, we need to keep working,
> it's active-active, we'll just set our local master and resolve it
> later"...... And until the connectivity is re-established, any cluster
> chopped off is otherwise read-only. They can commit *nothing*.
Users at a node which can't get an agreement for their commit can't commit.
That's a lot better than if they were using a remote server and lost the
connection, at which point they could neither read nor write. WANdisco users
get a clear message in the event they try to commit when a quorum isn't available.
The part I take issue with here is the idiot at the datacentre. The fact is
that people don't invest in a solution like WANdisco and then allow idiots
in datacentres free access to the server to break the replication group. Not
to mention that it's deliberately not an easy process to perform.
> Even worse: Unless you have a designated master cluster, losing the
> client clusters means the company's core Subversion services at their
> main office are read-only if the network connections to enough remote
> clusters break. There are environments where this is acceptable, but
> if I ever installed a source control system that went offline at the
> main offices because we lost overseas or co-location connections,
> *which happens when someone mucks up the main firewall in the
> corporate offices!!!*, they'd fire me without blinking the first time
> it happened.
It's a scenario that can be easily planned for. Often the main site will
have multiple local nodes (maybe with a load balancer in front for high
availability) so that a quorum can still be reached should the link fail. In
other scenarios people choose different quorum options to match their
requirements. I've not come across a situation yet where a whiteboard and a
little bit of forward thinking can't deal with any proposed scenario people
want to throw at us. Of course we don't pretend to change the laws of
physics. All of your global datacentres and Subversion
replicas spontaneously combusted? Maybe now it's time to find that backup.
> Until some idiot resets the quorum target list locally. That's not a
> software protection, it's a procedural one.
> In read-only mode, sure. That's how DNS slaves, NTP slaves, and "MMM"
> or "MySQL-Master-Master" works. The problem is the remote idiot who
> activates write access to their local quorum. There is no defense
> against this, except to throw a screaming hissy if it happens, and
> ensure that *every working copy taken from the split-off repository is
> entirely rebuilt from scratch*. And Subversion servers simply have no
> reliable record of where the working copies are to enforce this.
There is a perfectly good defence against this, and yes it's procedural. But
it's the same defence as not allowing the silly admin the ability to type rm
-rf * as root on a production server he thinks isn't, or 'drop database
everything;'. Perhaps not the best examples, but surely you accept it's a
silly point in a reasonably locked down enterprise environment with properly
controlled access.
> Needed? No, not if you're willing to leave your remote cluster in
> read-only mode for an indefinite period until the VPN or network
> connection can be re-established to rejoin it to the distributed set
> of clusters. That's likely to kill remote software productivity for
> hours, if not days. I've had VPN wackiness last for *weeks* due to
> bureaucratic befuddlement.
Again, you have to bear in mind our audience and the sort of customers we
work with. We do have customers with very difficult connections to one or
more sites globally. That doesn't affect the general usage of the platform
though and in fact it's those users who often benefit most from a local
WANdisco instance which cuts a load of read traffic off the network and
provides fast reliable access to the local server.
> There is a sane fallback in that situation. Replicate the service to
> an alternative backup with a different UUID, tell developers to use
> that one in the short term, and provide assistance migrating their
> changes to the primary repository when write operations are available.
> It's painful, but doable.
I could point you to plenty of people who wouldn't find that acceptable. If
you have 20,000 developers and thousands of commits a day you simply can't
put yourself in that sort of position I'd say.
> *Wrong*. As soon as a manager of an individual node can designate it a
> master with write permission, separated from the rest of the network,
> chaos is guaranteed. And you *CANNOT* hardcode the full set of nodes,
> because nodes have to be replaceable or discardable.
> See above. That quorum agreement is at risk from local "quorums".
Hopefully now you see why that's not true?
> Unless you've got some kind of transaction checksum stored with each
> Subversion database transaction to check for discrepancies, it's at
> risk for discrepancies to circulate, for that split brain situation
> under such circumstances.
Yes. Every transaction (i.e. each WebDAV change sent by the client) is
replicated with a checksum which is transmitted as part of the agreement and
verified at each node.
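A quick sketch of the idea (my own illustration; the use of SHA-256 here is an assumption purely for demonstration, not a statement of what we actually use): the checksum travels with the agreement, and any replica whose replayed change doesn't hash to the agreed value is detected immediately rather than silently diverging.

```python
# Illustrative per-transaction checksum check (assumed mechanism; SHA-256 is
# chosen here only for the sake of the example). The checksum is agreed along
# with the transaction, and each replica verifies its replayed copy against it.

import hashlib

def txn_checksum(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()

def verify_replay(payload: bytes, agreed_checksum: str) -> bool:
    """A replica re-hashes the change it received and compares."""
    return txn_checksum(payload) == agreed_checksum

change = b"PUT /trunk/foo.c rev-content"
agreed = txn_checksum(change)
assert verify_replay(change, agreed)            # identical change: accepted
assert not verify_replay(b"tampered", agreed)   # divergent change: flagged
```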
> Sadly, I've seen this sort of thing happen with other databases,
> especially involving sensitive and complex information, that are not
> well managed.
There are 300+ Enterprise users of our products today who represent many of
the largest Subversion deployments in the world and who have never seen this
sort of issue. But of course if we're talking about badly managed
deployments, they are probably being run by people who aren't talking to us.
Chief Solutions Architect
Received on 2011-04-28 15:18:55 CEST