Monday, January 15, 2018

Need help with an OSPF/BGP route selection issue

Hey everyone, hopefully I can explain this correctly. It's a little convoluted, but I can follow up with more details if needed. I will start with a brief description of the topology I am working with.

We have corporate datacenters in Seattle (SEA) and London (LHR) and multiple sites worldwide, each in its own AS. The basic setup is two border routers: router 1 is our DIA/tunnel router, and router 2 is our WAN/MPLS router. We also have a production network, which used to peer only with SEA. Our routers have 4 VRFs (internet, production, development, default/trust), and OSPF in each VRF peers with our firewall, where routes change zone/VRF. All corporate sites are connected via MPLS on router 2, in the trust zone/VRF.

In SEA we peer via BGP with all production sites over IPsec in the Production VRF on router 1. Each learned route is redistributed into OSPF, travels to the firewall, zone-transfers to trust, and then goes from OSPF in trust into BGP and out over MPLS to all the sites.

This was working great, even with the design flaw that any route coming in over a non-default VRF basically loses its AS path because it is put into OSPF to get through the firewall (not my design, but I'm making it work :D).
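To make the AS-path loss concrete, here is a minimal Python sketch of the BGP > OSPF > BGP round trip. The class and function names are hypothetical, and the assumption that the OSPF tag carries the first AS of the BGP path is only an observation from the show output in this post (tag 65107 for the ASH peer, tag 3549 for the route learned through Level3), not a documented rule.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BgpRoute:
    prefix: str
    as_path: Tuple[int, ...]
    origin: str = "IGP"


@dataclass(frozen=True)
class OspfExternal:
    prefix: str
    metric: int
    tag: Optional[int]  # the only trace of BGP history that survives


def bgp_to_ospf(route: BgpRoute, metric: int) -> OspfExternal:
    # Assumption: the tag mirrors the first AS in the path, as the outputs suggest.
    tag = route.as_path[0] if route.as_path else None
    return OspfExternal(route.prefix, metric, tag)


def ospf_to_bgp(route: OspfExternal) -> BgpRoute:
    # A redistributed route is originated locally: no AS path, origin incomplete.
    return BgpRoute(route.prefix, as_path=(), origin="incomplete")


learned = BgpRoute("172.28.0.0/16", (65107,))   # ASH route as received at LHR
external = bgp_to_ospf(learned, metric=1)       # production VRF, into OSPF
readvertised = ospf_to_bgp(external)            # trust VRF, back into BGP

print(external)
print(readvertised)
```

Once the path is gone, BGP in trust can no longer compare the original AS-path lengths, and only the tag hints at where the route came from.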

So, in an attempt to build some redundancy and prepare for tighter integration between our teams (corp and prod), we are turning up more paths between us. This is where I am hitting a strange issue. I will focus on the prod route 172.28.0.0/16, also known as ASH. Also of note is TUK, which is another prod site, the one SEA peers with, and which sends us the ASH routes (ASH>TUK>SEA).

We put in an ASH-to-LHR IPsec tunnel, peer BGP over it in the Production VRF, and accept the route 172.28.0.0/16. Because I wanted LHR to prefer the direct connection to ASH over going to Seattle and then ASH, I set the metric on the OSPF route from the ASH>LHR peer to 1, and the metric on the ASH>TUK>SEA>LON route to 250. This made it so people who VPN into LON use the direct tunnel to ASH.

We also have an office in London, which I will call LON. LON sees the route to ASH through the MPLS to SEA. So in LHR I added the ASH route to the OSPF-into-BGP redistribution in default/trust, added it to the prefix list of what LHR sends into MPLS, and used the community string from our IPVPN provider that says "in Europe, prefer this route". This community simply tells Level3 to install and announce this route on their Europe routers, versus the routes announced in the Americas, which have a different community string.

The problem is that in LHR, BGP in trust/default isn't installing the local prod route from OSPF; it is still selecting and installing the route from SEA. Even more interesting, router 1's route table in default/trust has the route to ASH via LHR, while router 2's route table in default/trust has the route to ASH via SEA.

Some quick info before I post these show commands:

Corp = 10/8
Prod = 172.16/12

LHR = 64544
SEA = 64664
L3 (MPLS) = 3549
ASH = 65107
TUK (prod site SEA peers with) = 65000

Here is LHR-R1 showing the ASH route from the tunnel locally being used:

LHR-CDC1-BR-RT1#show ip route vrf Production 172.28.0.0
Routing Table: Production
Routing entry for 172.28.0.0/16, 5 known subnets
  Attached (4 connections)
  Variably subnetted with 3 masks
B        172.28.0.0/16 [20/0] via 172.28.254.72, 01:43:45
C        172.28.254.68/31 is directly connected, Tunnel3
L        172.28.254.69/32 is directly connected, Tunnel3
C        172.28.254.72/31 is directly connected, Tunnel4
L        172.28.254.73/32 is directly connected, Tunnel4
LHR-CDC1-BR-RT1#show ip route 172.28.0.0
Routing entry for 172.28.0.0/16
  Known via "ospf 1", distance 110, metric 242
  Tag 65107, type extern 1
  Last update from 10.192.0.17 on Port-channel11.1, 01:43:23 ago
  Routing Descriptor Blocks:
  * 10.192.0.29, from 10.192.15.19, 01:43:23 ago, via Port-channel12.1
      Route metric is 242, traffic share count is 1
      Route tag 65107
    10.192.0.17, from 10.192.15.19, 01:43:23 ago, via Port-channel11.1
      Route metric is 242, traffic share count is 1
      Route tag 65107

And here is what is in the RIB for OSPF on LHR-R1:

Trust:
*> 172.28.0.0/16, Ext1, cost 242, fwd cost 241, tag 65107
     via 10.192.0.29, Port-channel12.1
     via 10.192.0.17, Port-channel11.1

Prod (missing the > as it's not installed):
*  172.28.0.0/16, Ext1, cost 500, fwd cost 250, tag 3549
     via 10.192.0.129, Port-channel12.3
     via 10.192.0.113, Port-channel11.3
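The two costs above line up with the metrics of 1 and 250 set at redistribution, if "fwd cost" is read as the internal OSPF cost toward the router that injected the route. For an external type 1 route the total is the redistributed metric plus that internal cost. A minimal Python sketch of the arithmetic, with a hypothetical helper name and assuming that reading of "fwd cost":

```python
def e1_cost(redistributed_metric: int, forward_cost: int) -> int:
    """Total cost of an OSPF external type 1 route: seed metric plus path cost."""
    return redistributed_metric + forward_cost


# Values taken from the OSPF RIB output in this post.
direct_ash = e1_cost(redistributed_metric=1, forward_cost=241)
via_sea = e1_cost(redistributed_metric=250, forward_cost=250)

print("direct LHR>ASH:", direct_ash)
print("via SEA:", via_sea)
```

So within OSPF the direct route wins comfortably, which matches the *> marker on the cost 242 entry.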

And here is the show ip bgp output for prod vs trust/default:

LHR-CDC1-BR-RT1#show ip bgp vpnv4 vrf Production 172.28.0.0
BGP routing table entry for 64544:2:172.28.0.0/16, version 574931132
Paths: (4 available, best #3, table Production)
  Not advertised to any peer
  Refresh Epoch 1
  65107, (aggregated by 65107 172.28.251.4), (received-only)
    172.28.254.68 (via vrf Production) from 172.28.254.68 (172.28.250.60)
      Origin IGP, metric 0, localpref 100, valid, external, atomic-aggregate
      rx pathid: 0, tx pathid: 0
  Refresh Epoch 1
  65000 65000 65000 65000, (received-only)
    172.16.254.80 (via vrf Production) from 172.16.254.80 (172.16.251.6)
      Origin IGP, metric 30, localpref 100, valid, external
      rx pathid: 0, tx pathid: 0
  Refresh Epoch 1
  65107, (aggregated by 65107 172.28.251.3), (received & used)
    172.28.254.72 (via vrf Production) from 172.28.254.72 (172.28.250.59)
      Origin IGP, metric 0, localpref 100, valid, external, atomic-aggregate, best
      rx pathid: 0, tx pathid: 0x0
  Refresh Epoch 1
  65000 65000 65000 65000, (received-only)
    172.16.254.77 (via vrf Production) from 172.16.254.77 (172.16.251.5)
      Origin incomplete, metric 30, localpref 100, valid, external
      rx pathid: 0, tx pathid: 0
LHR-CDC1-BR-RT1#show ip bgp 172.28.0.0
BGP routing table entry for 172.28.0.0/16, version 44800
Paths: (1 available, best #1, table default, RIB-failure(17))
  Not advertised to any peer
  Refresh Epoch 3
  3549 64664, (received & used)
    10.192.0.34 from 10.192.0.34 (10.192.15.10)
      Origin IGP, metric 0, localpref 100, valid, internal, best
      rx pathid: 0, tx pathid: 0x0

So from that I can see that locally LHR-R1 installed and uses the route to the prod VRF, and then the tunnel directly to ASH. But the trust/default BGP table has the SEA route to ASH.

LHR-R2 sees and prefers the ASH>TUK>SEA>LHR path:

LHR-CDC1-BR-RT2#show ip route 172.28.0.0
Routing entry for 172.28.0.0/16
  Known via "bgp 64544", distance 20, metric 0
  Tag 3549, type external
  Redistributing via ospf 1
  Advertised by ospf 1 metric 250 metric-type 1 subnets
  Last update from 100.65.0.9 5d07h ago
  Routing Descriptor Blocks:
  * 100.65.0.9, from 100.65.0.9, 5d07h ago
      Route metric is 0, traffic share count is 1
      AS Hops 2
      Route tag 3549
      MPLS label: none
LHR-CDC1-BR-RT2#show ip bgp 172.28.0.0
BGP routing table entry for 172.28.0.0/16, version 68163
Paths: (1 available, best #1, table default)
  Advertised to update-groups:
     2
  Refresh Epoch 1
  3549 64664, (received & used)
    100.65.0.9 from 100.65.0.9 (199.76.133.34)
      Origin IGP, localpref 100, valid, external, best
      rx pathid: 0, tx pathid: 0x0
LHR-CDC1-BR-RT2#

And the LHR>ASH route is the best OSPF route on LHR-R2:

LHR-CDC1-BR-RT2#show ip ospf 1 rib | inc 172.28.0.0
*   172.28.0.0/16, Ext1, cost 242, fwd cost 241, tag 65107

But as you can see, it's the best OSPF route but it is not installed. The eBGP route from SEA is winning, and obviously that is because eBGP AD is 20 and OSPF AD is 110. But perhaps my redistribution of OSPF into BGP on LHR-R2 isn't working, even though the route is there and is the best OSPF route?
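To sanity-check the AD reasoning on both routers, here is a small Python sketch. It assumes the Cisco IOS default administrative distances (eBGP 20, OSPF 110, iBGP 200) and, as a further assumption worth verifying on the box, that redistribution from OSPF into BGP only picks up a prefix when the installed RIB entry for it is the OSPF route. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Cisco IOS default administrative distances for the protocols in play.
DEFAULT_AD: Dict[str, int] = {"ebgp": 20, "ospf": 110, "ibgp": 200}


@dataclass(frozen=True)
class Candidate:
    source: str  # "ebgp", "ospf" or "ibgp"
    label: str   # hypothetical human-readable name for the path


def install(candidates: List[Candidate]) -> Candidate:
    """Return the candidate the RIB would install: lowest AD wins."""
    return min(candidates, key=lambda c: DEFAULT_AD[c.source])


def can_redistribute(installed: Candidate, from_source: str) -> bool:
    """Assumption: redistribution only sees the route that is installed."""
    return installed.source == from_source


# LHR-R1 learns the SEA path over iBGP; LHR-R2 learns it over eBGP from Level3.
lhr_r1 = [Candidate("ospf", "LHR>ASH direct"), Candidate("ibgp", "via SEA")]
lhr_r2 = [Candidate("ospf", "LHR>ASH direct"), Candidate("ebgp", "via SEA")]

for name, candidates in (("LHR-R1", lhr_r1), ("LHR-R2", lhr_r2)):
    winner = install(candidates)
    print(name, "installs", winner.source, "->", winner.label,
          "| OSPF-to-BGP redistribution possible:",
          can_redistribute(winner, "ospf"))
```

If that model holds, it would account for both symptoms: LHR-R1 keeps the OSPF route and flags the iBGP path as a RIB-failure, while on LHR-R2 the eBGP route occupies the RIB, so the OSPF route never becomes a candidate for redistribution into BGP.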

I have been working on various turn-ups for 20 hours now and can't seem to formulate a winning strategy here. How can I get BGP on LHR-R2 to prefer/install the LHR>ASH route over the LHR>L3>SEA>TUK route?

I know I am missing something super simple here, but right now my brain is mush. If you need any additional info, let me know. Any help is appreciated.


