Friday, July 5, 2019

Looking for advice on how or whether to use LAGs between non-stacked pairs of Dell/F10 switches

See diagram here: [Imgur](https://i.imgur.com/mDbpyoc.jpg)

I'm designing and building out a new data center using a pair of Dell S6010s for the core (all 40GE) and Dell S4048s for top-of-rack (40GE uplinks, 10GE downlinks). I have decided not to stack any of the switch pairs, because of the instability and possible outages that come with stack member reboots and upgrades. My goal is an L2 network with no single points of failure that still presents aggregated links of some sort to the downstream servers.

As you can see in the image linked above, we're using a spine-leaf topology, with the core switches and top-of-rack switch pairs linked together by high-speed interconnects (RSTP enabled, with one core switch configured to be the preferred root bridge), and each pair left as independent, non-stacked switches. This way, each switch retains its own control plane and can survive its peer switch dying or being rebooted.

The downstream servers are going to be a mix of Windows 2012R2, Ubuntu Linux 18LTS, and ESXi 6.5+. These systems need to run LACP or a similar protocol with the ToR switches that can ensure quick link failover in case one of the ToR switches stops forwarding. I can't wait for the usual spanning tree reconvergence time, as we have a strict 3-second outage limit for our application. I'm also afraid that standard "active/standby failover" bonding at the OS level wouldn't account for a situation where a ToR switch "locks up" and stops passing L2 traffic but still presents L1 link to the downstream server. Ideally, each system should detect when the upstream switch on a given link has stopped passing traffic or stopped sending BPDUs, and mark that link as offline/failed within 1-2 seconds.
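To put rough numbers on that timing budget, here is a minimal sketch that compares worst-case failure-detection times against the 3-second outage limit and the 1-2 second detection goal. It assumes the standard default timers (LACP fast: 1 s LACPDUs with a 3 s timeout; LACP slow: 30 s with a 90 s timeout; RSTP: 2 s hellos, neighbor aged out after 3 missed hellos). The class and function names are made up for illustration, and actual Dell/F10 or server OS behavior may differ or be tunable.

```python
from dataclasses import dataclass

OUTAGE_LIMIT_S = 3.0    # strict application outage limit stated in the post
DETECT_TARGET_S = 2.0   # upper end of the 1-2 second detection goal


@dataclass(frozen=True)
class Keepalive:
    """Hypothetical model: a peer is declared down after N missed periodic frames."""
    interval_s: float   # seconds between keepalive frames
    missed: int         # consecutive frames missed before declaring failure

    def worst_case_s(self) -> float:
        # Approximate worst case: the full timeout elapses after the last good frame.
        return self.interval_s * self.missed


# Assumed standard default timers; a given switch or OS may allow tuning them.
DETECTORS = {
    "LACP fast": Keepalive(interval_s=1.0, missed=3),    # 3 s short timeout
    "LACP slow": Keepalive(interval_s=30.0, missed=3),   # 90 s long timeout
    "RSTP hello": Keepalive(interval_s=2.0, missed=3),   # 3 missed 2 s hellos
}


def evaluate(detectors, target_s=DETECT_TARGET_S, limit_s=OUTAGE_LIMIT_S):
    """Return worst-case detection time and budget checks for each detector."""
    results = {}
    for name, det in detectors.items():
        worst = det.worst_case_s()
        results[name] = {
            "worst_case_s": worst,
            "meets_target": worst <= target_s,
            # Strictly less than: detection must leave time for the failover itself.
            "leaves_headroom": worst < limit_s,
        }
    return results


results = evaluate(DETECTORS)
for name, r in results.items():
    print(f"{name}: {r['worst_case_s']:.0f} s, "
          f"meets target={r['meets_target']}, headroom={r['leaves_headroom']}")
```

Under these assumed defaults, none of the keepalive timeouts fits the 1-2 second target on its own; even LACP fast timers spend the whole 3-second budget on detection. Each mechanism would therefore be leaning on loss of L1 link for fast failover, which is exactly what a locked-up switch would not provide.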

I can't do LACP between the top-of-rack switch pairs and the downstream servers, because LACP would require the ports upstream from each server to all be in one control plane (like a stacked pair), and I can't do VLT (virtual link trunking) between the core switches and ToR switches because VLT requires one end of the trunk to have a single control plane. I could do VLT if either the core or the ToR switch pair were in a stack, but they're not and never will be.

Am I stuck with trusting RSTP for path resiliency and giving up on link aggregation? What are my options for switch-pair to switch-pair aggregation and switch-pair to server aggregation, using the standard protocols that Dell/F10 switches offer?


