Monday, January 28, 2019

Fortigate firewall appears to not be following standard Aggregate hashing

I am running into a peculiar issue. My setup:

ISP1 -- Cisco9K-1 -\
                    Fortigate FW cluster
ISP2 -- Cisco9K-2 -/

The 9Ks are configured as a vPC pair, with a vPC port-channel going towards the FW cluster and a static route towards the FW IP. The firewalls are an Active-Passive pair; each one has an LACP LAG towards the 9Ks (connected to both of them) and a static route towards the VRRP VIP that lives on a VLAN off the 9Ks. The 9Ks also have BGP sessions with the ISPs, receiving default routes and advertising my blocks.

So far everything is fairly standard and works just fine. Here's when it gets interesting.

Cloudflare's API (api.cloudflare.com) is advertised via anycast, and it just so happens that ISP1 prefers to reach it at one colo location while ISP2 prefers another. This by itself should not be an issue, as I would expect one TCP flow to follow one specific path, so a given API request would always talk to the same data center.

Except that's not what happens. What I am seeing is that in a significant share of cases (10%-50%) the TCP handshake goes to Cisco9K-1, while the following HTTP GET request gets sent to Cisco9K-2. It is ALWAYS the HTTP call that gets routed the wrong way; the SYN and ACK of the TCP handshake ALWAYS go to the same switch. This makes me believe that something on the Fortigate firewall is trying to be smart and doing additional load-balancing based on L7 (from a TCP standpoint the source/destination IPs and ports are identical). It happens regardless of which agg hashing setting I configure on the port.
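To make the expectation concrete, here is a minimal Python sketch of what I mean by flow-based hashing. The hash function, the addresses and the `pick_member` helper are all made up for illustration; real LAG hashing is vendor-specific and this is not the FortiOS algorithm. The point is only that any hash fed nothing but the 5-tuple has to pick the same member link for the SYN, the ACK and the HTTP GET of one connection.

```python
import hashlib
import ipaddress


def pick_member(src_ip, dst_ip, proto, src_port, dst_port, n_links):
    """Map a flow's 5-tuple to a LAG member index.

    Illustrative only: SHA-256 stands in for whatever hash a real
    device uses. The input is the 5-tuple and nothing else.
    """
    key = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + proto.to_bytes(1, "big")
        + src_port.to_bytes(2, "big")
        + dst_port.to_bytes(2, "big")
    )
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links


# Hypothetical flow using documentation addresses (not my real ones).
flow = ("192.0.2.10", "198.51.100.20", 6, 51514, 443)

# Three packets of the same TCP connection: the payload differs,
# the 5-tuple does not, so the chosen member cannot differ either.
packets = ["SYN", "ACK", "HTTP GET"]
choices = {name: pick_member(*flow, n_links=2) for name in packets}
print(choices)
```

If a capture shows packets of one connection leaving on different members, then something other than the 5-tuple is feeding the decision, which is exactly what I seem to be observing.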

Has anybody run into this? Any suggestions on how to fix it? I can definitely do some routing workarounds on the switches, but ideally I'd want to fix it on the offending device - the firewall.

This is a rare corner case that really shouldn't be affecting many people in the wild, and until Cloudflare came into the picture I hadn't noticed any issues either. But still, it would be nice for network devices to follow standard network practices...


