Tuesday, January 9, 2018

PBR Route-Map ACL and Punts 6500-E/Sup720

Greetings all,

I'm working on some routing changes in our core and noticed something odd after the last round of updates. After taking a device out of the Layer 3 path and just passing traffic through it at Layer 2, I saw increased CPU usage on our other core 6500-E devices, with sustained increases upwards of 20%.

Looking at the "show proc cpu sorted" output, the load appears to be interrupts. A "show tcam int vlan [id] acl in ip" on any of the VLANs associated with the main route-map we're using shows a "punt ip any any" at the end. This is a permit route-map whose ACL is all denies (to keep latency-sensitive traffic like DHCP, DNS, VoIP, etc. out of the PBR) with a "permit ip any any" at the end to match everything else.
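For reference, here is a rough sketch of the shape of that ACL and route-map. The names (PBR-EXCLUDE, PBR-TO-FW), the VLAN number, and the specific deny lines are placeholders I made up for illustration, not our actual config:

```
! Placeholder names and values, for illustration only
ip access-list extended PBR-EXCLUDE
 remark keep latency-sensitive traffic out of PBR
 deny   udp any any eq bootps
 deny   udp any any eq bootpc
 deny   udp any any eq domain
 remark (further denies for VoIP etc. would go here)
 permit ip any any
!
route-map PBR-TO-FW permit 10
 match ip address PBR-EXCLUDE
!
interface Vlan100
 ip policy route-map PBR-TO-FW
```

The final "permit ip any any" is the entry that shows up as "punt ip any any" in the TCAM output.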

I'm confused about why it is punting. This route-map was in use before our maintenance, but we didn't see the CPU spike until after. I did change the set command, though, and I'm afraid that may be what did it.

We've got active/active firewalls, and I don't believe I can set the same IP address on each of them for the subinterface that connects to the VLAN our core devices use to talk to them. (The core devices all have an SVI on that VLAN, and we use PBR to push traffic to the firewalls so they look like they're Layer 2 adjacent.) Because of that, we set up a loopback with the same IP on each firewall to use as a VIP, and this gets advertised via OSPF (the route table shows it "via" the two subinterfaces previously mentioned). I'm attempting to use "set ip next-hop recursive" on the route-map to push traffic to that VIP, at which point the firewalls will process it. I'm wondering whether changing it to recursive is what is causing the punt.
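Roughly, the changed set clause looks like the sketch below. The route-map and ACL names are placeholders, and 192.0.2.1 is a documentation address standing in for the firewall VIP loopback, not our real address:

```
! Placeholder names; 192.0.2.1 stands in for the firewall VIP
route-map PBR-TO-FW permit 10
 match ip address PBR-EXCLUDE
 set ip next-hop recursive 192.0.2.1
```

With that in place, "show route-map PBR-TO-FW" shows the policy-routing match counters, and "show ip route 192.0.2.1" shows the OSPF paths the recursive lookup resolves through.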

I've been looking this up and it appears there should be hardware support for IPv4 recursive next-hop for load sharing.
http://ift.tt/2AKfYVm

The documents and examples I've found online all talk about load sharing, but then configure the route-maps with one or more "set ip next-hop" entries in addition to the recursive one. That confuses me, since it is my understanding that "set ip next-hop" entries take precedence over the recursive version, so at that point it wouldn't be load sharing.
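The pattern in those examples looks something like the sketch below. The names and addresses are placeholders (documentation ranges), and the comments reflect only my understanding of the ordering, which may be wrong:

```
! Placeholder names and addresses, for illustration only
route-map PBR-TO-FW permit 10
 match ip address PBR-EXCLUDE
 ! as I understand it, this plain next-hop is preferred when reachable
 set ip next-hop 198.51.100.1
 ! and the recursive next-hop is only used otherwise
 set ip next-hop recursive 192.0.2.1
```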

Any thoughts (besides PBR sucks... I know and I can't wait until we have the hardware/time to do something else for traffic/network segmentation)?

Thanks!


