I work as a network engineer for a regional ISP. We use SolarWinds NPM as our network monitoring system. One disappointing limitation of this system is that it cannot build alerts on anything other than a static threshold. For example:
If an interface with a description containing "Backbone" is at 80% utilization, send alert.
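This sort of static alerting threshold doesn't scale well: normal utilization differs from link to link and from hour to hour, so one fixed number ends up too noisy on some interfaces and too lax on others. I would love to implement an alerting system that uses something like a standard deviation from an expected utilization level, or some other method for anomaly detection.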
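To make the standard-deviation idea concrete, here is a minimal sketch in Python. It is not something NPM provides, and everything in it is an assumption for illustration: the function name `is_anomalous`, the multiplier `k`, the `min_samples` cutoff, and the sample utilization values. It flags a reading when it falls more than `k` standard deviations from the mean of recent readings for the same interface.

```python
from statistics import mean, stdev


def is_anomalous(history, current, k=3.0, min_samples=5):
    """Return True if `current` is more than k standard deviations
    away from the mean of `history` (recent utilization samples, in %).

    Hypothetical helper for illustration only; k and min_samples
    are arbitrary starting points, not tuned values.
    """
    # Too little history to estimate a baseline: do not alert.
    if len(history) < min_samples:
        return False

    baseline = mean(history)
    spread = stdev(history)  # sample standard deviation

    # A perfectly flat history has zero spread; treat any change as anomalous.
    if spread == 0:
        return current != baseline

    return abs(current - baseline) > k * spread


# Made-up utilization samples (percent) for one backbone interface.
recent = [40, 42, 38, 41, 39, 40, 43]
print(is_anomalous(recent, 41))  # within the usual range
print(is_anomalous(recent, 80))  # far outside the usual range
```

In practice the history would probably need to be bucketed by hour of day or day of week, since a link that sits at 40% at 3 a.m. and 75% at 8 p.m. has two different "normal" levels. A plain mean over mixed samples would hide that.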
I'm curious: what have other people's experiences been with static thresholds? How have you grown beyond them? What tools do you use for this purpose?