Sunday, June 10, 2018

DCI - vPC vs VXLAN

Hi All,

I'm doing a bit of design work and need to make a call on a technology for a DCI between two DCs less than 100 km apart. I've been reading the white papers on both technologies but wanted input from people who have used them in the field. Neither is designed specifically for DCI, but I have seen validated designs that use them. I know stretched L2 is the devil, and I understand I'm spreading the fault domain even after taking STP out of the equation. This design treats the two sites as a single availability zone.

  • Each site utilises UCS with ToR Fabric Interconnects for running VMs. Some of these VMs are virtual firewalls in Active/Standby.
  • The active virtual firewalls need to peer with SVIs on L3 core switches at each site. (Otherwise the L3 cores might not be needed, since the IP WAN could come into an aggregation switch at each site, but I don't want to deal with routing protocol peering over vPC, and I have no idea if that even works with anycast gateways in VXLAN; see the sketch after this list.)
  • Aggregation switches are purely L2 (except for the VXLAN underlay in Design 1).
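
For context on the anycast gateway point, a distributed anycast gateway on NX-OS looks roughly like the following. The VLAN, IP, and MAC values are placeholders of mine, and I haven't tested firewall SVI peering against it:

    feature interface-vlan

    ! One shared virtual MAC for the gateway on every VTEP
    fabric forwarding anycast-gateway-mac 0000.2222.3333

    ! The same SVI address is configured on the VTEPs at both sites
    interface Vlan100
      no shutdown
      ip address 10.1.100.1/24
      fabric forwarding mode anycast-gateway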

Design 1: Use VXLAN with an MP-BGP EVPN control plane to span L2 between sites (rough config sketch after the pros/cons): https://imgur.com/6ZdErEH

  • DCI links are L3 P2P between Nexus 9Ks.
  • Don't require the scale of a full-blown spine/leaf architecture; only need 2 VTEPs at each site.

Pros:

  • Routed L3 links between sites get rid of STP across the DCI.
  • Can use routing protocols for load balancing.
  • BFD for faster convergence over the DCI.
  • VXLAN EVPN is an industry standard.

Cons:

  • More complex to set up, change, and troubleshoot (arguably).
  • Potential issues with HA keepalives for firewalls over VXLAN (no first-hand experience, just stories I've heard; keen to get feedback from others running this kind of setup).
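
To make Design 1 concrete, this is roughly the per-VTEP shape I have in mind on the 9Ks. The ASN, addresses, and VLAN/VNI numbers are placeholders, and this is a sketch rather than a validated build:

    nv overlay evpn
    feature ospf
    feature bgp
    feature bfd
    feature vn-segment-vlan-based
    feature nv overlay

    vlan 100
      vn-segment 10100

    ! Routed P2P DCI link in the underlay
    interface Ethernet1/49
      no switchport
      ip address 10.0.0.0/31
      ip ospf network point-to-point
      ip router ospf 1 area 0.0.0.0

    interface loopback0
      ip address 10.255.255.1/32
      ip router ospf 1 area 0.0.0.0

    ! BFD on the underlay IGP for fast convergence over the DCI
    router ospf 1
      bfd

    ! VTEP: host reachability learned via BGP EVPN
    interface nve1
      no shutdown
      host-reachability protocol bgp
      source-interface loopback0
      member vni 10100
        ingress-replication protocol bgp

    ! iBGP EVPN peering to the remote VTEPs across the underlay
    router bgp 65001
      router-id 10.255.255.1
      neighbor 10.255.255.2 remote-as 65001
        update-source loopback0
        address-family l2vpn evpn
          send-community extended

    evpn
      vni 10100 l2
        rd auto
        route-target import auto
        route-target export auto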

Design 2: Back-to-back vPC with separate STP domains to span L2 between sites: https://imgur.com/NgrUjqJ

Pros:

  • Easier to set up initially and to make day-to-day configuration changes.

Cons:

  • Potentially more failure scenarios and load-balancing strangeness when dealing with vPC.
  • Need to be careful with the physical medium of the DCI; with no BFD, fast LACP timers etc. need a look (sketch below).
  • Potential issues with port-channels and out-of-order packets if the physical path lengths differ (the DCI medium will be physically diverse).
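
For Design 2, the DCI end of the back-to-back vPC would look something like this on each switch. VLANs and addresses are placeholders, and using BPDU filtering to split the STP domains is my reading of the validated designs rather than tested config:

    feature vpc
    feature lacp

    vpc domain 10
      peer-switch
      peer-gateway
      peer-keepalive destination 192.168.10.2 source 192.168.10.1

    ! Back-to-back vPC towards the remote site; filter BPDUs so each
    ! site keeps its own STP domain
    interface port-channel20
      switchport mode trunk
      switchport trunk allowed vlan 100-110
      spanning-tree port type edge trunk
      spanning-tree bpdufilter enable
      vpc 20

    ! Member links: fast LACP timers in lieu of BFD
    interface Ethernet1/49
      switchport mode trunk
      channel-group 20 mode active
      lacp rate fast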

Any gotchas that pros in the enterprise networking space can see with either of these designs?

I'm used to working with Nexus 5000s and 7000s. How does the 9300 platform stack up against these? I've never had any issues with vPCs on the 5500s or with MPLS/VRFs/large L3 tables on the 7000s. The 93180YC-EX and FX2 switches have piqued my interest. Keen to get away from big chassis switches as soon as possible.

Regarding VXLAN:

  • Is it still recommended to run multicast for BUM traffic even with EVPN? (See the snippet after this list.)
  • Most whitepapers and examples I've seen are built around a spine/leaf architecture. In this design, would it be better to use iBGP or eBGP between the aggregation switch pairs?
  • There's a lot of VXLAN documentation and whitepapers dating back to 2014. If anyone has favourite, up-to-date documentation on VXLAN EVPN deployment on the 9300 platform, that would be most appreciated.
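
On the multicast question, the two BUM-handling options per VNI look roughly like this (group address and VNI are placeholders):

    ! Option A: multicast replication in the underlay (needs PIM)
    interface nve1
      member vni 10100
        mcast-group 239.1.1.1

    ! Option B: ingress (head-end) replication signalled via EVPN
    interface nve1
      member vni 10100
        ingress-replication protocol bgp

With only two VTEP pairs, ingress replication looks like it would avoid running PIM in the underlay entirely, but I'd like to hear how it behaves in the real world.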

