Tuesday, January 30, 2018

Switch and router telemetry in the modern age

I've been thinking about what the future of router/switch monitoring looks like and am looking for enlightened opinions. The classical approach is to poll devices over SNMP and aggregate, but that doesn't scale: it's fine to poll 100 devices every fifteen seconds, but 10,000?
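To put a rough number on that, here is a back-of-envelope sketch in Python. The function name and the idea of "requests per device" are my own illustration, not anything from a real SNMP library, and it ignores retries, timeouts and how many OIDs you pack into each request.

```python
def polls_per_second(devices: int, interval_s: float, requests_per_device: int = 1) -> float:
    """Average request rate a poller must sustain (hypothetical helper).

    Assumes every device is polled once per interval and that each poll
    costs `requests_per_device` round trips.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return devices * requests_per_device / interval_s


# Compare the two fleet sizes mentioned above at a fifteen-second interval.
for fleet in (100, 10_000):
    rate = polls_per_second(fleet, 15)
    print(f"{fleet} devices: {rate:.1f} requests/s")
```

With one request per device that is about 7 requests per second for 100 devices versus about 667 for 10,000, and it grows linearly with every extra table you walk on each box.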

You could invert the model and have devices report interesting flows to some collector. I see some work in this area dating back to NetFlow, plus newer things like the ELK stack and some of what folks like AT&T are doing in ONAP (with Apache Avro data serialization).

Of course, at scale the volumes of data would be immense. Maybe that's enough to sink the deal. (Would you need a whole new network just to handle flow reporting? Have you just doubled your traffic?)
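One way to sanity-check the "doubled your traffic" worry is to compare the size of an export record with the size of the flow it describes. This is only a sketch: the 48-byte default is the size of a NetFlow v5 flow record, the average flow sizes are made-up inputs, and it assumes one record per flow with no sampling and no packet header overhead.

```python
def export_overhead(avg_flow_bytes: float, record_bytes: float = 48.0) -> float:
    """Extra reporting traffic as a fraction of the traffic being reported.

    Hypothetical helper: assumes exactly one export record per flow.
    """
    if avg_flow_bytes <= 0:
        raise ValueError("avg_flow_bytes must be positive")
    return record_bytes / avg_flow_bytes


# Made-up average flow sizes, from tiny flows up to bulk transfers.
for size in (500, 48_000, 5_000_000):
    print(f"avg flow {size} bytes: {export_overhead(size):.4%} overhead")
```

Under those assumptions the reporting traffic only gets close to the real traffic when flows are tiny, so the answer depends heavily on your traffic mix and on how much you sample.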

I can't shake the feeling that polling is just wrong in 2018. But I don't know what the cool kids are doing.

Idle side note: this was prompted by my VoIP supplier showing me one of their tools, where every signaling flow from any of their network elements is recorded in a database with a web GUI. That's obviously much higher in the stack, but a neat idea.


