Tuesday, January 9, 2018

What are some authoritative resources on proper network capacity planning and congestion mitigation?

Some specific questions I'm looking to answer as comprehensively as possible:

  1. During the design phase, how do I know how much bandwidth is really "enough" based on the business requirements of the network? I understand the sweet spot now is 1Gbps to the endpoint, 10Gbps to the access layer, and 40/100Gbps at the collapsed core (a rough oversubscription calculation is sketched after this list).
  2. In production, how do I know if current bandwidth is reaching or exceeding capacity? Put another way, what OIDs and thresholds do I set my NMS to alert on, which I can then point to and say, "it's time to upgrade"? I understand that monitoring an interface's aggregate throughput alone is not sufficient; I also need to watch output drops to catch microbursts (a minimal threshold sketch follows this list). Anything else?
  3. When links have finally reached capacity, what is the proper order of mitigation, assuming I can't just drop in higher bandwidth right away? My understanding:
    1. L2/L3 link aggregation (more parallel pipes; see the flow-hashing sketch after this list), then
    2. QoS (choose what traffic to keep/drop), then
    3. Deepen buffers (add latency--only for latency-insensitive applications), then
    4. Hardware upgrade becomes mandatory
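
For question 1, the usual starting point is oversubscription arithmetic: total possible downstream demand divided by uplink capacity, sanity-checked against measured per-user traffic. The Python sketch below is a minimal illustration of that calculation; the port counts, uplink sizes, and target ratios are hypothetical examples, not figures from any particular design guide.

```python
# Rough oversubscription arithmetic for a two-tier (collapsed-core) design.
# All port counts and ratios below are hypothetical examples.

def oversubscription(edge_ports: int, edge_speed_gbps: float, uplink_gbps: float) -> float:
    """Ratio of possible downstream demand to available uplink capacity."""
    return (edge_ports * edge_speed_gbps) / uplink_gbps

# 48 x 1G access ports with 2 x 10G uplinks toward the collapsed core.
access_ratio = oversubscription(edge_ports=48, edge_speed_gbps=1, uplink_gbps=2 * 10)
print(f"Access-to-core oversubscription: {access_ratio:.1f}:1")  # 2.4:1

# Commonly cited campus targets are on the order of 20:1 at the access layer
# and 4:1 higher up, but the real answer depends on measured per-user demand,
# not the rule of thumb.
```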

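For question 2, here is a minimal sketch of the kind of threshold logic an NMS could apply to the standard IF-MIB counters. The 80% utilization threshold and the sample counter values are hypothetical; the point is that average utilization over a polling interval is compared against one threshold, while output discards are alerted on separately, because interval averages hide microbursts.

```python
# Minimal threshold check built on the standard IF-MIB counters:
#   ifHCInOctets   1.3.6.1.2.1.31.1.1.1.6   (64-bit input octets)
#   ifHCOutOctets  1.3.6.1.2.1.31.1.1.1.10  (64-bit output octets)
#   ifOutDiscards  1.3.6.1.2.1.2.2.1.19     (output drops, incl. microburst tail drops)
# The 80% utilization threshold and the example values are hypothetical.

UTIL_ALERT_PCT = 80.0        # hypothetical upgrade-warning threshold
COUNTER_64_MAX = 2 ** 64

def delta(prev: int, curr: int, max_val: int = COUNTER_64_MAX) -> int:
    """Counter delta with wrap handling."""
    return curr - prev if curr >= prev else (max_val - prev) + curr

def utilization_pct(prev_octets: int, curr_octets: int,
                    interval_s: float, if_speed_bps: float) -> float:
    """Average utilization of one direction over the polling interval."""
    bits = delta(prev_octets, curr_octets) * 8
    return 100.0 * bits / (interval_s * if_speed_bps)

def evaluate(prev: dict, curr: dict, interval_s: float, if_speed_bps: float) -> list:
    """prev/curr are counter samples taken interval_s apart."""
    alerts = []
    out_util = utilization_pct(prev["ifHCOutOctets"], curr["ifHCOutOctets"],
                               interval_s, if_speed_bps)
    if out_util >= UTIL_ALERT_PCT:
        alerts.append(f"output utilization {out_util:.1f}% >= {UTIL_ALERT_PCT}%")
    drops = delta(prev["ifOutDiscards"], curr["ifOutDiscards"], max_val=2 ** 32)
    if drops > 0:
        alerts.append(f"{drops} output drops in {interval_s:.0f}s (possible microbursts)")
    return alerts

# Example: two 5-minute samples from a 10Gbps interface (values are made up).
prev = {"ifHCOutOctets": 1_000_000_000, "ifOutDiscards": 10}
curr = {"ifHCOutOctets": 1_000_000_000 + 320_000_000_000, "ifOutDiscards": 42}
print(evaluate(prev, curr, interval_s=300, if_speed_bps=10e9))
```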

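On mitigation step 1, one caveat worth remembering is that a port-channel balances traffic per flow, not per packet, so a single large flow is still limited to one member link's bandwidth. The sketch below is a purely illustrative hash, not any vendor's actual load-balancing algorithm; the interface names and flow tuples are made up.

```python
# Illustrative per-flow hashing over a LAG/port-channel.  Real switches use
# vendor-specific hash inputs (MAC, IP, L4 ports), but the consequence is the
# same: each flow is pinned to one member link, so one elephant flow cannot
# use more than a single member's bandwidth.
import hashlib

MEMBER_LINKS = ["Te1/0/1", "Te1/0/2", "Te1/0/3", "Te1/0/4"]  # hypothetical 4 x 10G bundle

def member_for_flow(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> str:
    """Pick a member link from a hash of the flow tuple (illustrative only)."""
    key = f"{src_ip}-{dst_ip}-{src_port}-{dst_port}".encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
    return MEMBER_LINKS[digest % len(MEMBER_LINKS)]

# Many small flows spread across the bundle...
for port in (50000, 50001, 50002, 50003):
    print(port, member_for_flow("10.0.0.5", "10.0.1.9", port, 443))

# ...but one big transfer between two hosts always hashes to the same link.
print("backup flow:", member_for_flow("10.0.0.5", "10.0.1.9", 51000, 22))
```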