Friday, January 18, 2019

I need to explain to other senior engineers why it's bad to cluster across data centers

On Monday, I'm going to be asked to explain to 25 other network engineers, including 6 or 7 other senior engineers, why it is bad to create a cluster across data centers, as opposed to having a cluster at each DC and failing over between them. And if I cannot sufficiently explain why, then "the way it's always been" will take precedence over our 99.999% service availability SLA.

I really wish I could say that I'm joking. But I'm not. I'm sitting here right now, just... stunned. I feel like I'm being asked to explain why it's painful to cut off your left foot with a hacksaw. All I can think of is, "because it hurts like hell!"

One of the senior engineers is deploying a firewall cluster to replace his aging firewalls. For years, the way we've always done it has been to buy two firewalls (or storage nodes, or VPN termination points, or anchor controllers, etc.) and put one at each of our data centers in an active-active or active/hot-standby design. Stretch those VLANs and call it a day. No, we do not bother reading vendor best-practice design documents. When someone points out that such documents generally discourage this type of design, the point is simply ignored. I gave up trying to intervene many years ago, but I have recently been put in a position where I'm expected to intervene and not give up.

What would you say in this situation? Any links to hard facts that I could maybe provide? I can't really give links and say "Go read this" because it will be ignored. But if I can throw out some facts and possibly even specific examples, I may have a chance.
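One fact I can already bring is how little downtime a 99.999% SLA actually allows. Here is a rough back-of-the-envelope sketch, assuming a 365-day year and that the SLA is measured over the whole year; the function name is my own, purely for illustration:

```python
def downtime_budget_minutes(availability_pct: float, days: float = 365.0) -> float:
    """Minutes of allowed downtime over `days` at a given availability percentage.

    Assumption: the SLA window is a plain 365-day year with no excluded
    maintenance windows. Adjust `days` if the SLA is measured differently.
    """
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Compare "three nines" through "five nines" side by side.
    for pct in (99.9, 99.99, 99.999):
        minutes = downtime_budget_minutes(pct)
        print(f"{pct}% -> {minutes:.2f} minutes of downtime per year")
```

Under those assumptions, five nines works out to roughly 5.26 minutes per year, so a single outage that takes down a cluster stretched across both data centers could use up the entire annual budget.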

Thoughts?


