Monday, February 24, 2020

BGP Design Critique / Questions

Apologies in advance, this is a long one. I have an opportunity to do a (relatively) greenfield BGP deployment for a small data center environment (3 locations, smallest 2 racks, largest 6 racks). I say relatively because it's one net-new greenfield deployment and 2 existing locations that will need to be retrofitted. Those two locations, however, currently only use BGP to peer with ISPs, receive a default route, and announce one /24 per site. I've settled on what I think is the approach I want to take, and am looking for people to shoot holes in the design (be gentle), offer suggestions, and answer a couple of questions. So here goes...

Each site has two ISP-facing routers (Cisco ISR 44xx line), each peering with 2 ISPs, which connect into a pair of public switches to aggregate router / firewall / VPN concentrator connections. The new location will be running in a leaf/spine design with VXLAN and EVPN (overkill, perhaps, but it's the proof-of-concept to roll to other locations), but this isn't really what I want to focus on. It's relevant only because what would otherwise be the "public" switches will be doing double duty, serving both as 'spines' between the two cabinet ToRs and as exit switches to the ISP routers. The exit switches will be the default gateway for the publicly routed /24 that each site has. To this end, I'll be running a 'wan' VRF for the WAN routing. There's also a VPLS circuit in the mix to provide connectivity between the three locations, which I plan on terminating into the aforementioned exit switches.

Since a picture is worth a thousand words...

https://ibb.co/8DBXmGn

Let's assume AS 1234 is our public ASN, and 80.80.80.0/24 is the public IP block for this site.

  • Exit switch 1 and 2 each peer with the corresponding router
  • One of the exit switches (since it's single hand-off) will peer with the exit switches in the other locations via the VPLS circuit
  • Exit switch 1 and 2 each have a static null0 route for 80.80.80.0/24, and announce that subnet to the routers, as well as the other locations via the VPLS circuit
  • Exit switches at other sites will send their local public subnets across the VPLS
  • The routers advertise a zero route (i.e. a default route, 0.0.0.0/0) to the exit switches
  • The routers receive a zero route from the ISPs along with, possibly, ISP customer subnets
  • The exit switches advertise a zero route to each other, but with a lower local preference, so that a switch that loses its upstream router still has connectivity, while the direct router / ISP connection is always preferred
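
To make the intended failover behaviour in the last bullet concrete, here is a minimal Python sketch of how an exit switch would pick between the two zero routes. It models only the local-preference comparison, not the full BGP best-path algorithm, and the names (Path, best_path, router-1, exit-switch-2) and the values 200 and 50 are made up for illustration; only their relative order matters.

```python
from dataclasses import dataclass
from typing import List, Optional

DEFAULT = "0.0.0.0/0"

# Hypothetical local-preference values; only the ordering matters.
VIA_ROUTER = 200        # zero route learned from the directly attached router
VIA_PEER_SWITCH = 50    # zero route learned from the other exit switch


@dataclass(frozen=True)
class Path:
    prefix: str
    learned_from: str
    local_pref: int


def best_path(paths: List[Path], prefix: str = DEFAULT) -> Optional[Path]:
    """Return the path for `prefix` with the highest local preference.

    Returns None when no path for the prefix exists. Real BGP applies
    further tie-breakers that are deliberately left out here.
    """
    candidates = [p for p in paths if p.prefix == prefix]
    return max(candidates, key=lambda p: p.local_pref, default=None)


# Exit switch 1 in normal operation: both zero routes are present.
normal = [
    Path(DEFAULT, "router-1", VIA_ROUTER),
    Path(DEFAULT, "exit-switch-2", VIA_PEER_SWITCH),
]

# Router 1 (or its link) fails, so only the peer switch's route remains.
after_router_failure = [p for p in normal if p.learned_from != "router-1"]

print(best_path(normal).learned_from)
print(best_path(after_router_failure).learned_from)
```

In this model the switch uses its own router while that path exists and falls back to the other exit switch only when it disappears, which is the behaviour the design is aiming for.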

I think that covers the highlights, so aside from general feedback, a few targeted questions:

  • Ideally, the routers will only announce a zero route to the exit switches if they themselves actually have a zero route, so that traffic is not blackholed. What's the best way to do this? neighbor x.x.x.x default-originate route-map <whatever>?
  • I'd like to take this opportunity to implement community strings, mostly so that they're already in place should I want to do anything with them down the road. I have - I think - a pretty good idea of how I want to lay them out, for example:
    1234:0 - Global, received from any ISP (for example, a zero route or ISP customer route)
    1234:1 - Global, originated by any exit switch (for example, 80.80.80.0/24)
    1234:2 - Global, received across a private WAN circuit (i.e. VPLS)
    1234:100 - Networks originating in site 1
    1234:101 - Networks originated by the exit switches, in site 1
    1234:102 - Networks sent across private WAN from site 1
    Rinse and repeat x00, x01, x02 for each site. Thoughts? Presumably I'd want to add these communities as close to the source as possible. So if, for example, the network 80.80.80.0/24 is announced via the network keyword, would I attach a route map there to add 1234:101 as well as 1234:1? And use an outbound route map for any VPLS peers to append 1234:2 and 1234:102?
  • I've started working with the idea that anything "WAN" facing would use the public ASN, and anything internal (i.e. the fabric) would use a private ASN. Where does it make sense to draw that line? Should the routers or the exit switches be the innermost AS 1234 devices?
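
As a sanity check on the community layout above, here is a small Python sketch that generates the values for any site and shows what a prefix would carry after being tagged at origin and again on export to a VPLS peer. It is a model of the numbering scheme only, not router configuration; the function names are hypothetical, and it assumes the outbound route map appends to existing communities rather than replacing them.

```python
from typing import Iterable, Set

ASN = 1234  # placeholder public ASN used in this post

# Global tags from the proposed layout.
GLOBAL_EXIT_ORIGINATED = 1   # 1234:1
GLOBAL_PRIVATE_WAN = 2       # 1234:2

# Per-site offsets, added to site * 100.
SITE_EXIT_ORIGINATED = 1     # 1234:x01
SITE_PRIVATE_WAN = 2         # 1234:x02


def community(value: int, asn: int = ASN) -> str:
    """Format a standard community; both halves must fit in 16 bits."""
    if not 0 <= asn <= 0xFFFF or not 0 <= value <= 0xFFFF:
        raise ValueError("standard communities are two 16-bit fields")
    return f"{asn}:{value}"


def site_community(site: int, offset: int) -> str:
    """Per-site community, e.g. site 1 with offset 2 gives 1234:102."""
    if site < 1 or not 0 <= offset <= 99:
        raise ValueError("site must be >= 1 and offset within 0..99")
    return community(site * 100 + offset)


def tag_at_origin(site: int) -> Set[str]:
    """Communities set where an exit switch originates a prefix."""
    return {
        community(GLOBAL_EXIT_ORIGINATED),
        site_community(site, SITE_EXIT_ORIGINATED),
    }


def tag_on_vpls_export(existing: Iterable[str], site: int) -> Set[str]:
    """Communities after the outbound VPLS policy appends its tags."""
    return set(existing) | {
        community(GLOBAL_PRIVATE_WAN),
        site_community(site, SITE_PRIVATE_WAN),
    }


# 80.80.80.0/24 originated at site 1, then announced across the VPLS.
origin_tags = tag_at_origin(1)
exported_tags = tag_on_vpls_export(origin_tags, 1)
print(sorted(origin_tags))
print(sorted(exported_tags))
```

One thing the model makes visible: because the value after the colon is a 16-bit field, the x00/x01/x02 pattern runs out at site 655, which is far more than three locations will ever need.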

If you've made it this far, thank you! I'm looking forward to any thoughts you may have!


