Wednesday, May 22, 2019

ISP gateway providing ARP replies for our IP space with differing VRRP MAC addresses?

Greetings Everyone!

Apologies for the length of this post; I'm trying to provide as much context and documentation as I can while I wrap my head around an issue we're having here. We have dual firewalls in an HA failover config. Each firewall has a physical IP and several virtual IPs configured for high-availability VRRP, which are used when the firewalls fail over. The VRRP virtual MAC addresses all start with 00:00:5e:00:01, followed by the VHID as the last octet.

Depending on the ARP response, there are stretches of time where we lose connectivity, sometimes after a couple of hours, sometimes after a few days. I'm convinced this is the result of some sort of ARP cache issue. Every MAC address involved can be traced to our own systems, so it's not a matter of duplicate IPs elsewhere in their extended network (we share a subnet with their other customers). I found no rogue or unidentifiable MAC addresses; it's just that the ISP gateway sometimes responds with the physical interface MAC and sometimes with the VRRP virtual MAC. We maintain two other HA firewall configs, with different IP space and different ISPs, that use the same type of config, and both have been trouble-free. This ISP is the only one that has been causing an issue, and the problem seems to be getting more frequent as time goes on.
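To make the virtual MAC convention above concrete, here is a minimal Python sketch of how the address is derived from the VHID, plus a quick check for whether a given MAC falls in the VRRP range. The function names are my own and purely illustrative; the only thing taken from the setup is the 00:00:5e:00:01 prefix with the VHID as the last octet.

```python
VRRP_IPV4_PREFIX = "00:00:5e:00:01"


def vrrp_virtual_mac(vhid: int) -> str:
    """Return the IPv4 VRRP virtual MAC for a VHID (valid range 1-255)."""
    if not 1 <= vhid <= 255:
        raise ValueError(f"VHID must be between 1 and 255, got {vhid}")
    # The VHID becomes the last octet, written as two hex digits.
    return f"{VRRP_IPV4_PREFIX}:{vhid:02x}"


def is_vrrp_virtual_mac(mac: str) -> bool:
    """True if the MAC uses the VRRP prefix, regardless of letter case."""
    return mac.lower().startswith(VRRP_IPV4_PREFIX + ":")


if __name__ == "__main__":
    for vhid in (1, 2, 3, 4):
        print(vhid, vrrp_virtual_mac(vhid))
```

Any ARP cache entry for a virtual IP that fails `is_vrrp_virtual_mac` is one where a physical MAC has been learned instead of the virtual one.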

I welcome any insight at this point. I feel like I'm going insane.

My Question:

  • Their gateway literally responds like a know-it-all grammar school kid to EVERY single ARP request, answering on behalf of both our physical IPs and our virtual IPs. Sometimes the MAC address in its reply matches the one in our own system's reply, sometimes it differs. For the virtual IPs, it sometimes advertises the physical MAC address and sometimes the VRRP virtual MAC address. For each ARP request, I receive 2 ARP replies: 1 from our system that's the target of the arping, and 1 from their gateway. Is this normal behavior that should be expected? It literally answers for EVERYTHING associated with our IP space.

As an example, here's the ARP table on our standby firewall while the primary firewall is active. I sent the output below (as well as traceroutes etc.) to their tech team, and I got the verbal equivalent of eyes glazing over. Their level 3 support defaulted to "reboot the modem", which is something we've already been doing at least once a week anyway. Rebooting the modem brought a couple of the virtual IPs back; others remain an issue, as noted below.

(ISP_GATEWAY) at (ISP_GW_MAC) on bge4 expires in 1173 seconds [ethernet]

(VIRTUAL_IP1) at (FW1_PHYS_MAC) on bge4 expires in 388 seconds [ethernet]

(PHYSICAL_FW2) at (FW2_PHYS_MAC) on bge4 permanent [ethernet]

(VIRTUAL_IP2) at (VIP2_VIRT_MAC) on bge4 expires in 1193 seconds [ethernet]

(VIRTUAL_IP3) at (VIP3_VIRTUAL_MAC) on bge4 expires in 1183 seconds [ethernet]

(VIRTUAL_IP4) at (FW1_PHYS_MAC) on bge4 expires in 1170 seconds [ethernet]

  • arping output and the associated tcpdump. In the case below, the ISP gateway is responding with the physical MAC address of our primary firewall, while the primary firewall itself is responding with the virtual VRRP MAC.

ARPING VIRT_IP_3

60 bytes from FW1_PHYS_MAC (VIRT_IP_3): index=0 time=165.729 usec

60 bytes from ISP_GW_MAC (VIRT_IP_3): index=1 time=10.383 msec

60 bytes from FW1_PHYS_MAC (VIRT_IP_3): index=2 time=183.767 usec

60 bytes from ISP_GW_MAC (VIRT_IP_3): index=3 time=12.337 msec

60 bytes from FW1_PHYS_MAC (VIRT_IP_3): index=4 time=181.841 usec

60 bytes from ISP_GW_MAC (VIRT_IP_3): index=5 time=104.296 msec

10:35:53.667220 ARP, Request who-has VIRT_IP_3 tell FW2_PHYS_IP, length 44

10:35:53.667385 ARP, Reply VIRT_IP_3 is-at VIP3_VIRT_MAC (oui IANA), length 46

10:35:53.677589 ARP, Reply VIRT_IP_3 is-at FW1_PHYS_MAC (oui Unknown), length 46

10:35:54.667351 ARP, Request who-has VIRT_IP_3 tell FW2_PHYS_IP, length 44

10:35:54.667516 ARP, Reply VIRT_IP_3 is-at VIP3_VIRT_MAC (oui IANA), length 46

10:35:54.679669 ARP, Reply VIRT_IP_3 is-at FW1_PHYS_MAC (oui Unknown), length 46

10:35:55.669868 ARP, Request who-has VIRT_IP_3 tell FW2_PHYS_IP, length 44

10:35:55.670034 ARP, Reply VIRT_IP_3 is-at VIP3_VIRT_MAC (oui IANA), length 46

10:35:55.774143 ARP, Reply VIRT_IP_3 is-at FW1_PHYS_MAC (oui Unknown), length 46

  • Here's another arping output for a different virtual IP. This time, the ISP gateway is responding with the virtual MAC used for VRRP, which is identical to the response from the local system (primary firewall). This particular IP address is pingable from the outside, but traceroutes can't get beyond the gateway: they hit their gateway and then time out beyond that.

ARPING VIRT_IP_2

60 bytes from FW1_PHYS_MAC (VIRT_IP_2): index=0 time=218.118 usec

60 bytes from ISP_GW_MAC (VIRT_IP_2): index=1 time=9.392 msec

60 bytes from FW1_PHYS_MAC (VIRT_IP_2): index=2 time=185.204 usec

60 bytes from ISP_GW_MAC (VIRT_IP_2): index=3 time=11.139 msec

60 bytes from FW1_PHYS_MAC (VIRT_IP_2): index=4 time=136.903 usec

60 bytes from ISP_GW_MAC (VIRT_IP_2): index=5 time=124.488 msec

10:48:02.686125 ARP, Request who-has VIRT_IP_2 tell FW2_PHYS_IP, length 44

10:48:02.686345 ARP, Reply VIRT_IP_2 is-at VIP2_VIRT_MAC (oui IANA), length 46

10:48:02.695503 ARP, Reply VIRT_IP_2 is-at VIP2_VIRT_MAC (oui IANA), length 46

10:48:03.687297 ARP, Request who-has VIRT_IP_2 tell FW2_PHYS_IP, length 44

10:48:03.687462 ARP, Reply VIRT_IP_2 is-at VIP2_VIRT_MAC (oui IANA), length 46

10:48:03.698416 ARP, Reply VIRT_IP_2 is-at VIP2_VIRT_MAC (oui IANA), length 46

10:48:04.688300 ARP, Request who-has VIRT_IP_2 tell FW2_PHYS_IP, length 44

10:48:04.688423 ARP, Reply VIRT_IP_2 is-at VIP2_VIRT_MAC (oui IANA), length 46

10:48:04.812769 ARP, Reply VIRT_IP_2 is-at VIP2_VIRT_MAC (oui IANA), length 46
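To spot these mismatches without reading captures by eye, here is a rough Python sketch that scans tcpdump ARP lines and reports any IP for which the replies announced more than one MAC. It assumes the line format shown in the captures above; the function names are hypothetical, and the sample input is just an excerpt of the two captures with the placeholders left in.

```python
import re
from collections import defaultdict

# Matches tcpdump ARP reply lines such as:
#   10:35:53.667385 ARP, Reply VIRT_IP_3 is-at VIP3_VIRT_MAC (oui IANA), length 46
ARP_REPLY = re.compile(r"ARP, Reply (?P<ip>\S+) is-at (?P<mac>[^\s,]+)")


def macs_claimed_per_ip(lines):
    """Collect, for each target IP, the set of MACs announced in ARP replies."""
    claims = defaultdict(set)
    for line in lines:
        match = ARP_REPLY.search(line)
        if match:
            claims[match.group("ip")].add(match.group("mac"))
    return dict(claims)


def conflicting_ips(lines):
    """Return only the IPs whose replies announced more than one MAC."""
    return {
        ip: macs
        for ip, macs in macs_claimed_per_ip(lines).items()
        if len(macs) > 1
    }


# Excerpt of the two captures above, placeholders left as-is.
CAPTURE = """\
10:35:53.667220 ARP, Request who-has VIRT_IP_3 tell FW2_PHYS_IP, length 44
10:35:53.667385 ARP, Reply VIRT_IP_3 is-at VIP3_VIRT_MAC (oui IANA), length 46
10:35:53.677589 ARP, Reply VIRT_IP_3 is-at FW1_PHYS_MAC (oui Unknown), length 46
10:48:02.686125 ARP, Request who-has VIRT_IP_2 tell FW2_PHYS_IP, length 44
10:48:02.686345 ARP, Reply VIRT_IP_2 is-at VIP2_VIRT_MAC (oui IANA), length 46
10:48:02.695503 ARP, Reply VIRT_IP_2 is-at VIP2_VIRT_MAC (oui IANA), length 46
""".splitlines()

if __name__ == "__main__":
    for ip, macs in sorted(conflicting_ips(CAPTURE).items()):
        print(ip, "claimed by", ", ".join(sorted(macs)))
```

On this excerpt it flags only VIRT_IP_3. VIRT_IP_2 is not flagged because both replies carry the same virtual MAC, so this check catches differing answers but not the fact that two separate hosts are answering.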


