Thursday, June 28, 2018

Error Rate Monitoring

I'm trying to figure out a way to monitor discards and errors on our Juniper routers using the SNMP IF-MIB. I notice that there are OIDs for ifInDiscards, ifInErrors, ifOutDiscards, and ifOutErrors, but how can I turn that data into an error rate, or a percentage of total packets that have errors? I don't want to send alerts on the mere existence of an error or discard on a port, especially if it's 1 error over the course of 5 million packets or 30 days. I'm just looking to monitor the percentage of packets having errors (i.e., alert at 1% or higher).

If it helps, we are using a modified TICK stack: Grafana for graphing, Influx for storage, Prometheus for alerting, and Telegraf for collection.
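To make the question concrete, here is a rough sketch of the arithmetic I have in mind, not something we run today. It takes two polls of the same interface and divides the change in ifInErrors (or ifInDiscards) by the change in total inbound packets. The assumptions are mine: the helper names are made up, all counters are treated as 32-bit, at most one wrap is assumed between polls, and "total packets" is taken to be delivered packets (ifInUcastPkts, ifInMulticastPkts, ifInBroadcastPkts) plus errors plus discards.

```python
# Hypothetical helpers; names and structure are illustrative only.
COUNTER32_MODULUS = 2 ** 32  # assumes Counter32 objects throughout

# IF-MIB counters of packets delivered to higher layers (inbound).
DELIVERED = ("ifInUcastPkts", "ifInMulticastPkts", "ifInBroadcastPkts")


def counter_delta(prev, curr, modulus=COUNTER32_MODULUS):
    """Change between two counter samples, tolerating a single wrap."""
    return (curr - prev) % modulus


def inbound_rates(prev, curr, modulus=COUNTER32_MODULUS):
    """Return inbound error and discard percentages between two polls.

    prev and curr are dicts keyed by IF-MIB object name for one interface.
    """
    errors = counter_delta(prev["ifInErrors"], curr["ifInErrors"], modulus)
    discards = counter_delta(prev["ifInDiscards"], curr["ifInDiscards"], modulus)
    delivered = sum(counter_delta(prev[k], curr[k], modulus) for k in DELIVERED)

    # Errored and discarded packets are not counted as delivered,
    # so they are added back to estimate everything that arrived.
    total = delivered + errors + discards
    if total == 0:
        return {"total": 0, "error_pct": 0.0, "discard_pct": 0.0}

    return {
        "total": total,
        "error_pct": 100.0 * errors / total,
        "discard_pct": 100.0 * discards / total,
    }
```

With something like this, an alert could fire only when error_pct is 1.0 or higher and total is above some minimum, so that a single error on a nearly idle port does not page anyone. The outbound side would presumably be the same calculation with the ifOut* counters.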

Thoughts? Thanks guys.


