Wednesday, June 19, 2019

Tracking Down High Latency

Hello, I'm working alongside other people on this very specific issue of high latency.

This is a point-to-point EoSDH circuit connecting two states over 4 thousand kilometers, owned by a local carrier. We have worked very closely with them, but in the end we could not pinpoint the cause; according to them, the equipment had no logs or errors.

This is a graph of the delay from A to B:

https://imgur.com/a/6woeWOj

A few facts we know and things we tried:

1 - This is a 20Mbps line composed of 10 x 2Mbps virtual containers

2 - The normal latency for the line is around 60ms

3 - Latency peaks are pretty much never higher than 600ms.

4 - CPE (C3925) shows no output drops or other errors

5 - There is no packet loss even when latency is high

6 - Latency is observed from CPE to PE

7 - Engaged TAC to make sure the CPE is good; no issues found.

8 - Took pcaps from a WPA at the LAN side in an attempt to see anything out of the ordinary; found nothing.

9 - Test sets such as Fluke and JDSU showed no errors on the line, and the latency reported by the test set was OK.
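
As a rough sanity check on facts 1 to 3, here is a small sketch of how much data would have to be sitting in a queue somewhere to turn 60ms into 600ms. It assumes the extra delay is pure queueing at the 20Mbps bottleneck and that packets are full-size 1500-byte frames; both are assumptions for illustration, not things we measured, and the function name is made up.

```python
# Rough estimate: how much data would have to sit in a queue to add the
# observed extra delay, ASSUMING the delay is pure queueing at the 20 Mbps
# bottleneck (an assumption, not a measured fact).

LINE_RATE_BPS = 20_000_000   # 10 x 2 Mbps virtual containers (nominal rate)
BASELINE_MS = 60             # normal latency from the list above
PEAK_MS = 600                # worst observed latency from the list above
ASSUMED_PACKET_BYTES = 1500  # hypothetical full-size packet, for scale only


def queued_bytes(line_rate_bps: int, baseline_ms: float, peak_ms: float) -> float:
    """Bytes that must be buffered to add (peak - baseline) of delay."""
    extra_seconds = (peak_ms - baseline_ms) / 1000.0
    return line_rate_bps * extra_seconds / 8


backlog = queued_bytes(LINE_RATE_BPS, BASELINE_MS, PEAK_MS)
print(f"Implied backlog: {backlog / 1e6:.2f} MB")
print(f"Roughly {backlog / ASSUMED_PACKET_BYTES:.0f} full-size packets")
```

Under those assumptions the backlog comes out to about 1.35 MB, or around 900 full-size packets, which gives a sense of how deep a buffer would have to be to produce the peaks without any loss.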

We stopped thinking it could be related to the carrier network when, during an outage on the main link, we observed the same latency spike on the backup link, which belongs to a different carrier with pretty much a completely different topology.

So we are now more inclined to think this could be caused by an application, or by the behavior of a piece of equipment that we have so far failed to see.

I'd like to get a couple of ideas on what you guys usually do to track down the source of a latency issue.

I know this post sounds a bit confusing and vague. I'm sorry for that, but I hope I can get some suggestions on how to proceed.

Thanks.


