Apologies for the quick and dirty diagram; it was quicker and easier to create this than to redact an existing document and post that.
The image is located here: https://i.imgur.com/7bI7KfP.png
First, here is a quick explanation of this topology.
The routers/firewalls are directly connected with a crossover cable, and they talk to each other on a High Availability interface configured in the firewalls. The connection from each firewall to the IDS/IPS device (we can call this interface 1) handles a few VLANs; in fact, all the links in the diagram are VLAN trunk ports. I don't think my issue exists at this level, so I'll move on.
The next device is an IDS/IPS that is configured by a vendor; all we are required to do is cable the device into our network. They are aware of the topology and have told us which interfaces are in and which are out. Also, this device is passive: if it has a failure, all traffic still passes but is not inspected. I don't think my issue is here, but more on that below.
The fiber switches don't have redundant power supplies (part of the issue) and are directly connected to each other, with the link configured as a port channel (in this case, port channel 1). From there, I have redundant fiber links to the buildings in our environment.
Here is the issue I am running into. Today there was a power blip. All of our equipment is connected to UPS units, but the UPS unit that fiber switch 2 is plugged into either has a bad battery or had some other issue that caused it to fully cycle (no battery power). That is a separate issue, but it is what prompted this post. While fiber switch 2 was rebooting, there was no network connectivity for any users/devices connected to buildings/switches 1-4. I am running STP (MSTP), and I assumed that when fiber switch 2 dropped offline, traffic would flip to fiber switch 1. Fiber switch 1 is set to a priority of 8192 and fiber switch 2 is set to 12288. It seems to me that fiber switch 2 was acting as the main switch, since all traffic stopped while it was rebooting. Once it had fully rebooted, everything was back to normal.
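For reference, here is a minimal sketch of how the spanning tree root election should treat those two priorities, assuming the standard rule that the lowest bridge ID (priority first, then MAC address) wins. The MAC addresses are made-up placeholders, only the two fiber switches are modeled, and the per-instance details of MSTP are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bridge:
    name: str
    priority: int  # configured bridge priority (a multiple of 4096)
    mac: str       # hypothetical MAC, only used to break priority ties

    def bridge_id(self):
        # Lower tuple compares as "better": priority first, then MAC.
        return (self.priority, int(self.mac.replace(":", ""), 16))


def elect_root(bridges):
    """Return the bridge with the lowest bridge ID among those that are up."""
    return min(bridges, key=lambda b: b.bridge_id())


# Priorities are from the post; MAC addresses are placeholders.
fiber1 = Bridge("fiber switch 1", 8192, "00:00:00:00:00:01")
fiber2 = Bridge("fiber switch 2", 12288, "00:00:00:00:00:02")

root_normal = elect_root([fiber1, fiber2])   # both switches up
root_sw2_down = elect_root([fiber1])         # fiber switch 2 rebooting

print(root_normal.name)
print(root_sw2_down.name)
```

Under these assumptions fiber switch 1 is the root in both cases, so losing fiber switch 2 should not force a new root election on its own.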
Is it possible that STP was reconverging during the reboot of fiber switch 2 and that was the cause of the delay? I can't say for certain that the outage lasted exactly as long as the switch reboot, but it was fairly close. It has been a while, but I feel confident that I tested this scenario before this setup went into production: when I pulled the power cable from either fiber switch, I only dropped a few pings to remote devices and the turnaround time was less than 10 seconds. When I did that testing, the IDS/IPS box was NOT in play. We had not contracted with this company at that time, so there was no way for me to test with an IDS/IPS. The topology was otherwise the same as what you see above, except that the firewalls plugged directly into fiber switch 1 and fiber switch 2, respectively.
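To compare a repeat of that failover test against the earlier result, the outage length can be computed from a ping log. The following is only a sketch: the `longest_outage` helper and the sample data are hypothetical, and it assumes one ping per second recorded as (timestamp, reply received) pairs.

```python
def longest_outage(samples):
    """Return the longest outage in seconds.

    samples: list of (timestamp_seconds, reply_received) in time order.
    An outage runs from the first lost ping of a run to the next reply.
    An outage still in progress at the end of the log is not counted.
    """
    longest = 0.0
    outage_start = None
    for ts, ok in samples:
        if not ok and outage_start is None:
            outage_start = ts                      # first lost ping of a run
        elif ok and outage_start is not None:
            longest = max(longest, ts - outage_start)
            outage_start = None                    # connectivity restored
    return longest


# Made-up sample: replies at t=0,1, losses at t=2,3,4, replies again from t=5.
samples = [(0, True), (1, True), (2, False), (3, False),
           (4, False), (5, True), (6, True)]

outage = longest_outage(samples)
print(f"longest outage: {outage} s, under 10 s: {outage < 10}")
```

Running the same measurement with and without the IDS/IPS in the path would show whether the extra device changes the failover time.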
Also, yes, I should probably be routing here, but this network/interface is just a small section of our network, and there are other projects being worked on. There has been talk of moving these links over to the routed portion of our network, but other things need to be done before that can happen.
Thanks.