Thursday, March 26, 2020

Weird SNMP behavior, wondering if it's specific to 3850s or IOS more broadly

Set up a couple of new 3850s recently, one running 16.3.7 and the other running 16.6.7 (hoping to get that first one upgraded to Everest soon). Hooked up their management interfaces (which are in their own VRFs) and added a default route in that VRF, but did not add a default route to the global routing table (they're solely access switches, so I hadn't bothered to add any routing statements beyond getting the management port up).

After setting them up I tried adding them to our monitoring systems and found that while I could SSH to the management interface just fine, I couldn't get SNMP data when querying the management interface. We've got plenty of other devices set up like this (monitoring/management on a separate interface) and SNMP queries to them work just fine; we even had a few other 3850s set up like this that were working just fine, running the same version of code, so it was a bit of a head-scratcher.

Eventually I figured out that the 3850s on which SNMP was working properly had default routes in their global routing tables. Went ahead and set one on one of the misbehaving switches, and SNMP queries started working immediately. As far as I can tell, it's responding from the management interface; I don't see return traffic coming back from a non-management IP interface on the 3850. It's like under the hood, it's doing a check of "I received an SNMP query from xyz. Do I have a route (in the global table) to xyz? If yes, did I receive this on an interface that's in Mgmt-intf? If yes, do I have a route (in VRF Mgmt-intf) to xyz?" Without a route in the global table, it never gets past that first check, so the query just times out.
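
To spell that guess out, here's a rough Python sketch of the check I'm imagining. It's only a model of the behavior I observed, not anything taken from Cisco documentation or IOS internals; the function names, the table layout, and the 10.1.1.50 poller address are all made up for illustration.

```python
import ipaddress


def has_route(table, dst):
    """Return True if any prefix in the routing table covers dst."""
    addr = ipaddress.ip_address(dst)
    return any(addr in ipaddress.ip_network(prefix) for prefix in table)


def snmp_reply_sent(src, ingress_vrf, global_table, vrf_tables):
    """Hypothetical model: global-table lookup first, then the VRF lookup."""
    # Check 1 (the suspicious one): is the querier reachable via the global table?
    if not has_route(global_table, src):
        return False  # nothing is sent back, so the poller just times out
    # Check 2: if the query arrived in a VRF, look for a return route in that VRF.
    if ingress_vrf is not None:
        return has_route(vrf_tables.get(ingress_vrf, []), src)
    return True


# Hypothetical tables: the management VRF has a default route in both cases.
mgmt = {"Mgmt-intf": ["0.0.0.0/0"]}

# No global default route: fails at check 1 even though the VRF has a route.
print(snmp_reply_sent("10.1.1.50", "Mgmt-intf", [], mgmt))
# Global default route added: falls through to the VRF lookup and succeeds.
print(snmp_reply_sent("10.1.1.50", "Mgmt-intf", ["0.0.0.0/0"], mgmt))
```

If the switch really behaves like this, the global route only has to exist; it never has to be the path the reply actually takes, which would match what I'm seeing on the wire.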

Anyone seen anything like this?


