Saturday, January 23, 2021

BGP Traffic sharing caused major issue... any ideas?

First, a network diagram with a description of the issue: https://i.imgur.com/So5caCh.png (we're advertising our aggregate netblock to both ISPs, which I didn't mention in the diagram)

Hi all, I turned up our redundant transit connection today with the above topology, which I thought was fairly standard, but immediately started getting tons of complaints from customers who were intermittently unable to get out to the internet. I had to turn the redundant transit connection off again until I resolve the issue. I think our iBGP routes may be the cause. I'm not sure how, but my first plan for testing this is to lower the import preference so the iBGP routes are only selected as a last resort.
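To make that plan concrete, here is a rough Python sketch of how de-preferring iBGP routes changes which path gets selected. It is only an illustration under several assumptions: it reads "import preference" as BGP local preference set at import, it assumes two edge routers with one ISP each, and it models only a small part of the real best-path decision (local preference, then AS-path length, then eBGP over iBGP). The `Route` class, the helper names, the prefix, the next-hop labels and all the numbers are hypothetical and not taken from the actual config.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Route:
    prefix: str
    learned_via: str   # "eBGP" or "iBGP"
    next_hop: str      # hypothetical label, not a real address
    local_pref: int    # higher wins
    as_path_len: int   # shorter wins


def best_route(candidates):
    """Pick a winner using a simplified subset of BGP best-path rules:
    highest local preference, then shortest AS path, then eBGP over iBGP."""
    return min(
        candidates,
        key=lambda r: (-r.local_pref, r.as_path_len, r.learned_via != "eBGP"),
    )


def deprefer_ibgp(routes, ibgp_local_pref):
    """Return a copy of the routes with every iBGP-learned route
    given the supplied (lower) local preference."""
    return [
        replace(r, local_pref=ibgp_local_pref) if r.learned_via == "iBGP" else r
        for r in routes
    ]


# Hypothetical view from the edge router attached to ISP A:
# the same prefix is learned directly from ISP A and, via iBGP,
# from the other edge router attached to ISP B.
routes = [
    Route("203.0.113.0/24", "eBGP", "ISP A", 100, 3),
    Route("203.0.113.0/24", "iBGP", "other edge router (ISP B)", 100, 2),
]

before = best_route(routes)                    # equal local-pref: shorter AS path wins
after = best_route(deprefer_ibgp(routes, 50))  # iBGP de-preferred: local exit wins

print(before.learned_via, "->", after.learned_via)
```

With equal local preference the iBGP path wins here purely because of its shorter AS path, so traffic crosses to the other edge router; once the iBGP routes are de-preferred, each router uses its own ISP and only falls back to iBGP when its eBGP route disappears.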

While troubleshooting, I used a site called "ping.pe" to test reachability to customers from various locations. Pings from some locations worked absolutely fine but failed from others. I could see the traffic ingressing from ISP A, and replies made it back to some destinations but not others. The traceroutes ping.pe provided showed that the traffic was reaching us and only failing once it got here, which means it must have been failing in the return direction.

I just can't for the life of me figure out a definitive reason why it would work for some and not others :) Has anyone had experience with a similar situation? Am I misunderstanding how this should be configured? I know there will be some asymmetric routing, but that shouldn't be an issue.


