Friday, February 22, 2019

Calculating physical and logical failure risks within Cisco Nexus Leaf/Spine

I'm trying to quantify the network-side probability (five nines, ten nines) of a catastrophic event disrupting the entire domain. For HW I'm thinking zero/infinite (no single failure results in any loss of service; the worst case is a reduction in available BW), but for SW I can think of several theoretical protocol/table events that could occur. I'm looking to capture the potential of hitting a bug or an erroneous config, and the recovery time (RTO) needed to restore service. The opposing side of the question is the risk reduction gained from a second or Nth leaf/spine fabric isolated across a backbone or in disparate DCs. Even an answer along the lines of "the likelihood is unknown, but when an event does disrupt the fabric it takes 40 minutes to restore across four spines and eight leafs (or whatever the math scales to)" would be helpful.
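To make the question concrete, here is a rough sketch of the arithmetic I have in mind, not a validated model. Everything numeric in it is an assumption: the event rate of one fabric-wide software/config event every five years is a placeholder, the 40-minute restore time is the figure from above, and `common_cause_fraction` is a hypothetical knob for the share of events (same bug, same pushed config) that would hit every fabric at once. It uses the usual steady-state approximation (unavailability = events per year × restore time ÷ minutes per year) and a simple beta-factor style split between shared and independent failures.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60


def unavailability(events_per_year, restore_minutes):
    """Fraction of the year one fabric is down (steady-state approximation)."""
    return events_per_year * restore_minutes / MINUTES_PER_YEAR


def combined_unavailability(u, n_domains, common_cause_fraction=0.0):
    """Fraction of the year ALL n isolated fabrics are down at the same time.

    common_cause_fraction is an assumed share of events that take out every
    fabric together (shared bug or config). 0.0 means fully independent.
    """
    shared = common_cause_fraction * u
    independent = ((1.0 - common_cause_fraction) * u) ** n_domains
    return shared + independent


def nines(u):
    """Express an unavailability as a 'number of nines' of availability."""
    return -math.log10(u)


if __name__ == "__main__":
    # Assumed inputs: 1 event per 5 years, 40 minutes to restore.
    single = unavailability(events_per_year=0.2, restore_minutes=40)
    print(f"one fabric:                 {nines(single):.2f} nines")
    print(f"two fabrics, independent:   {nines(combined_unavailability(single, 2)):.2f} nines")
    print(f"two fabrics, 10% shared:    {nines(combined_unavailability(single, 2, 0.1)):.2f} nines")
```

With those placeholder inputs a single fabric lands a little under five nines, and a second fully independent fabric roughly doubles the count. As soon as some fraction of events is shared between fabrics, that shared term dominates, which is exactly the part I don't know how to put a number on.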

I know this is a near-impossible question to answer with hard data, but I'm hoping someone has done enough of the math to quantify portions of it.

Thanks.


