Wednesday, February 24, 2021

BGP routing on the VM or LBs in active-active DC

We're planning to have only two DCs, with 5-10 ms latency between them. How would you do active-active for services? I'm thinking EVPN/VXLAN would be quite hard, as there's always the issue of how to route towards the local GW (FW) without tromboning the traffic through the other DC. Currently we have four, two in each location, and we're doing some sort of VMware clustering stretched across both DCs to get "high availability" (in quotation marks, as this has failed several times, causing VMs to lose disks and leading to long downtimes...)

I'm thinking I would have two options. The first is to have the LBs (currently F5, maybe HAProxy in the future as we're doing super simple stuff) advertise a /32 towards our network, with each server having two NICs, one for DC 1 and one for DC 2. When a server is in DC 1, it has 192.168.1.0/24 (for example) connected, and in DC 2 it has 192.168.2.0/24 connected. Based on which of these is connected, the LBs would do AS-path prepends so that the LB in the same DC as the server gets the traffic.
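To make the prepend idea concrete, here is a rough sketch of the decision, not an F5 or HAProxy configuration. It assumes each LB has its own private ASN (65001 and 65002 are made up), that the service address is 192.168.99.10/32 (also made up), and that the upstream routers end up comparing AS-path length. In real BGP, path length is only consulted after attributes such as local preference, so this models just that one step.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Hypothetical private ASNs, one per load balancer / DC.
LB_ASN = {"DC1": 65001, "DC2": 65002}


@dataclass(frozen=True)
class Route:
    prefix: str
    lb_dc: str                 # DC of the LB that advertised this route
    as_path: Tuple[int, ...]


def advertise(prefix: str, lb_dc: str, server_dc: str, prepends: int = 3) -> Route:
    """Build the route an LB would advertise for the service prefix.

    The LB in the same DC as the server advertises a plain path;
    the LB in the other DC prepends its own ASN to look less attractive.
    """
    asn = LB_ASN[lb_dc]
    extra = 0 if lb_dc == server_dc else prepends
    return Route(prefix, lb_dc, (asn,) * (1 + extra))


def best_route(routes: Iterable[Route]) -> Route:
    """Pick the route with the shortest AS path (one step of best-path selection)."""
    return min(routes, key=lambda r: len(r.as_path))


def winning_dc(server_dc: str, prefix: str = "192.168.99.10/32") -> str:
    """Return the DC whose LB would attract traffic for the prefix."""
    routes = [advertise(prefix, dc, server_dc) for dc in LB_ASN]
    return best_route(routes).lb_dc


if __name__ == "__main__":
    for dc in LB_ASN:
        print(f"server in {dc}: traffic enters via LB in {winning_dc(dc)}")
```

The LB in the other DC still advertises the prefix, just with a longer path, so it remains a fallback if the local LB withdraws its route.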

The other option would be to run BGP on the hosts: configure both 192.168.1.1 and 192.168.2.1 as BGP neighbours, and then configure something like 192.168.99.x/32 as the "floating service IP" that the services listen on and that is always the same no matter which DC the VM is in.
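For the BGP-on-host option, the usual pattern is a small health check that announces the service /32 only while the service answers and withdraws it otherwise. The sketch below is an assumption-heavy illustration: the address 192.168.99.10 and the function names are made up, and the command strings follow the text format that, as far as I know, ExaBGP accepts from a helper process. With BIRD or FRR the same logic would instead add or remove the address or route that the daemon redistributes.

```python
import socket
from typing import Optional


def service_is_up(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the service succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def bgp_command(service_ip: str, healthy: bool) -> str:
    """Build an announce/withdraw line for the floating service /32.

    The string format is an assumption modelled on ExaBGP's text API.
    """
    action = "announce" if healthy else "withdraw"
    return f"{action} route {service_ip}/32 next-hop self"


def next_command(service_ip: str, was_healthy: Optional[bool], healthy: bool) -> Optional[str]:
    """Return a command only when the health state changes.

    was_healthy is None on the first check, so the initial state is always sent.
    """
    if was_healthy == healthy:
        return None
    return bgp_command(service_ip, healthy)


if __name__ == "__main__":
    # Hypothetical floating service IP; a real check would loop and print
    # each command to stdout for the BGP daemon to read.
    state = None
    for observed in (True, True, False, True):
        cmd = next_command("192.168.99.10", state, observed)
        if cmd:
            print(cmd)
        state = observed
```

Since both 192.168.1.1 and 192.168.2.1 are configured as neighbours, only the session in the DC where the VM currently runs would come up, so the /32 follows the VM without any per-DC configuration on the host.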

If we have one VM in each DC I guess the options are pretty much the same? Our problems are with availability, not with performance of a VM.

Any thoughts? Thanks!


