Monday, February 24, 2020

Cisco Trunk/Access Ports and Spanning-tree

TL;DR: I had a trunk port connected to an access port for 4 days before the far end erred out due to spanning tree, and even after it erred out I still had link. Why would this work for so long?

Had a weird issue at the office today. To give some background, my company recently moved into office space that is shared with our data center. This data center has a 100G fiber ring around the town, and we have several VLANs on this ring that run to our rack, one for each of our customers on the ring, allowing us to get traffic from their network to ours without touching the internet. When we moved into this office space, we worked with them to split off another VLAN for our offices, so that we could access our own fiber ring around the city.

The way it was set up is that the Ethernet jacks in our offices would be access ports on the data center’s switch on VLAN X, and then that VLAN would be trunked back through their switch stack to the LAG or port channel that goes to our rack. When we moved in last Thursday, we configured a Cisco 3750 PoE (just what we had lying around) with the uplink port as a trunk port and the remaining ports as access ports for whatever equipment we would need to plug in.

All day Thursday and Friday there were no issues with this setup, and up until around 9:30 or so this morning, everything was fine. Then we lost connectivity to our rack. I opened a ticket with the data center and spent a couple of hours troubleshooting my own equipment. After concluding that the issue was not with my own equipment, because I had link on every port that was connected to something, I sat waiting for the data center to answer the ticket.

Around 3pm or so, they finally got back to us, saying that their port had erred out because of a spanning tree problem and that the problem was on our network. My first thought was: we don’t have any loops on our network, none of our switches threw a spanning tree error, and if their port is erred out, shouldn’t we have lost link? Then they emailed over the actual error messages from their switch, which shed some more light:

*Feb 24 08:53:16: %SPANTREE-SP-7-RECV_1Q_NON_TRUNK: Received 802.1Q BPDU on non trunk FastEthernetA/Z VLANX
*Feb 24 08:53:16: %SPANTREE-SP-7-BLOCK_PORT_TYPE: Blocking FastEthernetA/Z on VLANX. Inconsistent port type.
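If the data center had sent a longer log export, these two messages could have been picked out with a small script. The following is only a rough Python sketch: the regular expression is my own guess at the message layout based on the two lines above, and the names (SYSLOG_RE, parse_line, port_type_events) are made up for illustration, not part of any Cisco tool.

```python
import re

# Assumed layout, inferred only from the two messages quoted in this post:
#   %FACILITY[-SUBFACILITY]-SEVERITY-MNEMONIC: free text
SYSLOG_RE = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)"
    r"(?:-(?P<sub>[A-Z0-9_]+))?"
    r"-(?P<severity>[0-7])"
    r"-(?P<mnemonic>[A-Z0-9_]+):\s*(?P<text>.*)"
)

# The two mnemonics that appeared in the data center's email.
PORT_TYPE_MNEMONICS = {"RECV_1Q_NON_TRUNK", "BLOCK_PORT_TYPE"}


def parse_line(line):
    """Return the message fields as a dict, or None if the line does not match."""
    match = SYSLOG_RE.search(line)
    if match is None:
        return None
    event = match.groupdict()
    event["severity"] = int(event["severity"])
    event["text"] = event["text"].strip()
    return event


def port_type_events(log_text):
    """Collect the spanning tree port type messages from a block of log text."""
    events = []
    for line in log_text.splitlines():
        event = parse_line(line)
        if (
            event is not None
            and event["facility"] == "SPANTREE"
            and event["mnemonic"] in PORT_TYPE_MNEMONICS
        ):
            events.append(event)
    return events


# The two lines from the email, with the port and VLAN still anonymized.
SAMPLE_LOG = """\
*Feb 24 08:53:16: %SPANTREE-SP-7-RECV_1Q_NON_TRUNK: Received 802.1Q BPDU on non trunk FastEthernetA/Z VLANX
*Feb 24 08:53:16: %SPANTREE-SP-7-BLOCK_PORT_TYPE: Blocking FastEthernetA/Z on VLANX. Inconsistent port type.
"""

if __name__ == "__main__":
    for event in port_type_events(SAMPLE_LOG):
        print(event["mnemonic"], "->", event["text"])
```

Running it over the sample prints one line per matching message, which is enough to see which port and VLAN the far-end switch is complaining about.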

So now I realize that the issue is that their ports are configured as access and ours are configured as trunk. Personally, I don’t have the technical knowledge to understand why this makes a difference to spanning tree, but I understand enough to see that this is the problem. Once we changed our switch uplink ports to access, essentially turning our 3750s into dumb switches, we got connectivity back. Here’s what I’m having trouble understanding: why did this work just fine for 4 or so days, and why did I still have link on my ports if the other end of the link was erred out?


