Friday, June 28, 2019

Spanning Tree, LACP, and possible EIGRP reconvergence issues I am unable to track down

So this last week or two I have been running into some really strange issues in our environment.

We have been upgrading the code on our IE3000 switches. The method is this: one of our employees fires up his lab IE3000 with the updated code, matches its config to the production IE3000, then replaces the flash card. This has been a fairly standard process for a while, but recently, whenever one of these IE3000 switches is rebooted, I see the following log on the distribution switch:

__________________________________________________________________________

Jun 27 2019 08:46:56.954 PDT: %LACP-SW1-4-MULTIPLE_NEIGHBORS: Multiple neighbors detected on Gi1/8/35: new neighbor(sys-mac-id: ****.****.8800, port: 0x102), old neighbor(sys-mac-id: ****.****.8800, port: 0x103)

Jun 27 2019 08:46:56.958 PDT: %LACP-SW1-4-MULTIPLE_NEIGHBORS: Multiple neighbors detected on Gi2/8/35: new neighbor(sys-mac-id: ****.****.8800, port: 0x103), old neighbor(sys-mac-id: ****.****.8800, port: 0x102)

__________________________________________________________________________
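To make the pattern in these two messages easier to see, here is a minimal Python sketch that pulls out the partner details and checks whether the partner port IDs traded places between the two links. The names `LACP_LOGS`, `parse_lacp`, and `ports_swapped` are made up for illustration, and the regular expression assumes the message layout is exactly as shown above.

```python
import re

# The two messages from the distribution switch, copied from the log above.
# The system MAC is partially masked in the source and is kept masked here.
LACP_LOGS = [
    "Jun 27 2019 08:46:56.954 PDT: %LACP-SW1-4-MULTIPLE_NEIGHBORS: Multiple neighbors detected on Gi1/8/35: "
    "new neighbor(sys-mac-id: ****.****.8800, port: 0x102), old neighbor(sys-mac-id: ****.****.8800, port: 0x103)",
    "Jun 27 2019 08:46:56.958 PDT: %LACP-SW1-4-MULTIPLE_NEIGHBORS: Multiple neighbors detected on Gi2/8/35: "
    "new neighbor(sys-mac-id: ****.****.8800, port: 0x103), old neighbor(sys-mac-id: ****.****.8800, port: 0x102)",
]

# Assumed layout: "... detected on <intf>: new neighbor(...), old neighbor(...)"
PATTERN = re.compile(
    r"detected on (?P<intf>\S+): "
    r"new neighbor\(sys-mac-id: (?P<new_mac>[^,]+), port: (?P<new_port>0x[0-9a-fA-F]+)\), "
    r"old neighbor\(sys-mac-id: (?P<old_mac>[^,]+), port: (?P<old_port>0x[0-9a-fA-F]+)\)"
)


def parse_lacp(lines):
    """Map each local interface to the partner details reported in its message."""
    result = {}
    for line in lines:
        m = PATTERN.search(line)
        if m:
            result[m["intf"]] = m.groupdict()
    return result


def ports_swapped(a, b):
    """True when both entries name the same partner system and the partner port IDs traded places."""
    same_system = a["new_mac"] == a["old_mac"] == b["new_mac"] == b["old_mac"]
    return same_system and a["new_port"] == b["old_port"] and a["old_port"] == b["new_port"]


neighbors = parse_lacp(LACP_LOGS)
print(ports_swapped(neighbors["Gi1/8/35"], neighbors["Gi2/8/35"]))
```

For this pair of messages the check comes out true: both links report the same partner system ID, and the partner port that Gi1/8/35 used to see (0x103) now shows up on Gi2/8/35, and vice versa. The sketch only restates what the log says; it does not show why the port IDs moved.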

The next logs are from a different day. Going by the syslogs and what the employees told me, some IE3000s were upgraded this morning, and shortly after those IE3000 switches rebooted, EIGRP reconverged and caused issues on our network. I have a feeling the two are related; I am just too new to understand the underlying issue.

__________________________________________________________________________

Jun 28 2019 06:19:00.355 PDT: %SYS-SW1-3-CPUHOG: Task is running for (2000)msecs, more than (2000)msecs (78/74),process = LTL MGR.-Traceback= 0x99C03DCz 0x99B3390z 0x97DB034z 0x97D5690z 0x97C19E0z 0x9AB276Cz 0x9AB2A48z 0x9AB4CE4z 0x9AB6BE8z 0x9AB7044z 0x9AB7D14z 0x9AB904Cz 0x5124844z 0x51382A4z 0x5144AA8z 0x5499280z

Jun 28 2019 06:32:29.731 PDT: %SYS-SW1-3-CPUHOG: Task is running for (2000)msecs, more than (2000)msecs (3/3),process = LTL MGR.-Traceback= 0x97D58F0z 0x97D5928z 0x9AB2800z 0x9AB2A48z 0x9AB2D5Cz 0x9AB7E38z 0x9AB904Cz 0x5124844z 0x51382A4z 0x5144AA8z 0x5499280z 0x5493654z

Jun 28 2019 06:36:35.733 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 100: Neighbor ******* (Vlan2301) is down: holding time expired

Jun 28 2019 06:36:36.861 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 210: Neighbor ******* (Vlan3322) is down: holding time expired

Jun 28 2019 06:36:39.637 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 220: Neighbor ******* (Vlan3311) is down: holding time expired

Jun 28 2019 06:36:39.637 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 220: Neighbor ******* (Vlan3312) is down: holding time expired

Jun 28 2019 06:36:39.637 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 100: Neighbor ******* (Vlan2302) is down: holding time expired

Jun 28 2019 06:36:40.185 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 100: Neighbor ******* (Vlan2301) is up: new adjacency

Jun 28 2019 06:36:40.881 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 210: Neighbor ******* (Vlan3321) is down: holding time expired

Jun 28 2019 06:36:41.377 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 210: Neighbor ******* (Vlan3322) is up: new adjacency

Jun 28 2019 06:36:42.457 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 100: Neighbor ******* (Vlan2302) is up: new adjacency

Jun 28 2019 06:36:42.693 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 210: Neighbor ******* (Vlan3321) is up: new adjacency

Jun 28 2019 06:36:42.853 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 220: Neighbor ******* (Vlan3312) is up: new adjacency

Jun 28 2019 06:36:43.053 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 220: Neighbor ******* (Vlan3311) is up: new adjacency

__________________________________________________________________________
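The EIGRP entries are easier to read as per-neighbor outages. The sketch below, in Python, pairs each "is down" line with the next "is up" line for the same AS and VLAN interface and reports the gap in seconds. The names `EIGRP_LOGS` and `flap_durations` are hypothetical, and the pattern assumes the timestamp and message layout shown in the excerpt above.

```python
import re
from datetime import datetime

# NBRCHANGE lines copied from the excerpt above (neighbor addresses are masked in the source).
EIGRP_LOGS = """\
Jun 28 2019 06:36:35.733 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 100: Neighbor ******* (Vlan2301) is down: holding time expired
Jun 28 2019 06:36:36.861 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 210: Neighbor ******* (Vlan3322) is down: holding time expired
Jun 28 2019 06:36:39.637 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 220: Neighbor ******* (Vlan3311) is down: holding time expired
Jun 28 2019 06:36:39.637 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 220: Neighbor ******* (Vlan3312) is down: holding time expired
Jun 28 2019 06:36:39.637 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 100: Neighbor ******* (Vlan2302) is down: holding time expired
Jun 28 2019 06:36:40.185 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 100: Neighbor ******* (Vlan2301) is up: new adjacency
Jun 28 2019 06:36:40.881 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 210: Neighbor ******* (Vlan3321) is down: holding time expired
Jun 28 2019 06:36:41.377 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 210: Neighbor ******* (Vlan3322) is up: new adjacency
Jun 28 2019 06:36:42.457 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 100: Neighbor ******* (Vlan2302) is up: new adjacency
Jun 28 2019 06:36:42.693 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 210: Neighbor ******* (Vlan3321) is up: new adjacency
Jun 28 2019 06:36:42.853 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 220: Neighbor ******* (Vlan3312) is up: new adjacency
Jun 28 2019 06:36:43.053 PDT: %DUAL-SW1-5-NBRCHANGE: EIGRP-IPv4 220: Neighbor ******* (Vlan3311) is up: new adjacency
"""

# Assumed layout: "<Mon DD YYYY HH:MM:SS.mmm> <TZ>: %DUAL-...-NBRCHANGE: EIGRP-IPv4 <AS>: Neighbor <addr> (<intf>) is up|down"
PATTERN = re.compile(
    r"^(?P<ts>\w{3} +\d+ \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) \w+: "
    r"%DUAL-\S+-NBRCHANGE: EIGRP-IPv4 (?P<asn>\d+): "
    r"Neighbor \S+ \((?P<intf>\S+)\) is (?P<state>up|down)"
)


def flap_durations(text):
    """Return {(AS number, interface): seconds between 'down' and the next 'up'}."""
    down_at = {}
    durations = {}
    for line in text.splitlines():
        m = PATTERN.match(line.strip())
        if not m:
            continue  # skip blank lines and unrelated messages
        ts = datetime.strptime(m["ts"], "%b %d %Y %H:%M:%S.%f")
        key = (int(m["asn"]), m["intf"])
        if m["state"] == "down":
            down_at[key] = ts
        elif key in down_at:
            durations[key] = round((ts - down_at.pop(key)).total_seconds(), 3)
    return durations


for (asn, intf), seconds in sorted(flap_durations(EIGRP_LOGS).items()):
    print(f"AS {asn} {intf}: {seconds:.3f} s")
```

On these lines, all six adjacencies across AS 100, 210, and 220 come back within about two to five seconds of going down. Keep in mind that "holding time expired" means hellos had already stopped arriving for the full hold interval before each "is down" entry (15 seconds by default on this kind of interface, unless the timers were tuned), so the actual disruption started earlier than the first timestamp.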

Lightweight topology:

•Nexus 7K Core --> VSS Catalyst 6509 Distribution --> IE 3000 Access

•Redundant links running from the VSS paired 6509 to the IE 3000

•We use LACP for the etherchannel bundles.

6509 Distribution Port Channel Config:

interface Port-channel180

switchport

switchport mode trunk

!

interface GigabitEthernet2/3/32

switchport

switchport mode trunk

channel-protocol lacp

channel-group 180 mode active

!

interface GigabitEthernet1/3/32

switchport

switchport mode trunk

channel-protocol lacp

channel-group 180 mode active

IE3000 Port Channel Config:

interface Port-channel1

switchport mode trunk

!

interface GigabitEthernet1/1

switchport mode trunk

channel-protocol lacp

channel-group 1 mode active

!

interface GigabitEthernet1/2

switchport mode trunk

channel-protocol lacp

channel-group 1 mode active

I have never seen an IE3000 cause an EIGRP reconvergence in my very long 2 years as a network engineer. I am leaning towards a spanning tree issue. Unfortunately, STP is something I have not really had to deal with yet, so I am having a fun time running show spanning-tree summary and show spanning-tree detail and staring at the output with a blank face.

edit: forgot how to format.


