Thursday, November 5, 2020

Juniper EX3400 stops a network loop, but the network still crashes. I was able to fix it, but would like advice please.

Hi everyone! Tuesday morning our entire network went down. It took me about 3 hours, but I was finally able to trace it to a single patch cable, which led me to the damn network loop that caused the outage. Someone had plugged a small switch in at their cube and then connected its other end back to the wall...

Anyways, I thought STP was supposed to prevent this? I looked on my switch and found storm control was enabled on the ports. It looks like it did its job, because I would get intermittent pings etc. and the switches weren't completely locked up like I have experienced in the past with loops. The issue, though, is that even though it SEEMED to stop the loop, we still experienced about 50% packet loss throughout the company, which really screwed up almost all connections except for the most basic ones like web surfing, which just slowed to a crawl.

I did some brief research but wasn't able to find anything concrete. I apologize if this is a dumb question, but isn't the entire point of storm control or STP or whatever to prevent this type of issue? Is there more to the configuration than I thought, and maybe I just need to enable some more advanced features? I was hoping it would shut down the port completely and save us, but I'm not sure if switches are smart enough to know where the loop is coming from.

Any insight is appreciated. I'm curious to see how professionals handle network loops in their environment, or I would just love to hear your "fun" story! I swear about 2 hours into it I was about to have a panic attack, but eventually I just started unplugging switches until I found the issue. I am just a lonely sysadmin, so I'm not a network guru, but I'm in charge of the whole company's infrastructure. It's about 175 users and like 8 Junipers scattered throughout the building.

Thanks!!


