Friday, May 24, 2019

How many TCP retransmissions / interface discards are OK?

We're trying to troubleshoot the server guys' storage latency issues (VMs on NFS), and I'm seeing some discards on our DCI link. It's 6x10Gbps, and the servers are also connected with 10Gbps NICs. How many discards would you consider OK? I'm thinking it's OK to have some, as two flows might get hashed to the same DCI member link and max out that 10Gbps, causing some drops (microbursts).
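To put a rough number on the "two flows on the same link" idea, here is a small sketch. It assumes the hash spreads flows uniformly and independently over the 6 member links, which real hashing only approximates. The function names are made up for illustration.

```python
def collision_probability(flows: int, links: int = 6) -> float:
    """Chance that at least two flows land on the same member link.

    Assumes each flow is hashed uniformly and independently (an idealization).
    """
    if flows > links:
        return 1.0  # more flows than links: a shared link is unavoidable
    p_all_distinct = 1.0
    for i in range(flows):
        # The i-th flow must avoid the i links that are already taken.
        p_all_distinct *= (links - i) / links
    return 1.0 - p_all_distinct


def discard_ratio(discards: int, packets: int) -> float:
    """Discards as a fraction of packets sent on the interface."""
    return discards / packets if packets else 0.0


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(f"{n} heavy flows: {collision_probability(n):.1%} chance of sharing a link")
```

With just two heavy flows the chance of sharing a link is already 1 in 6, so occasional discards from this alone would not be surprising. The discard counter is probably more useful as a ratio of discards to transmitted packets than as a raw number.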

The server guys are also telling me that they're seeing TCP retransmissions on their servers, probably because of the discards. What would you consider a normal amount? They're of course telling me that even one is a bad thing :)
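One way to turn "we're seeing retransmissions" into a rate is to compare retransmitted segments against all sent segments over an interval. The sketch below assumes the servers are Linux hosts exposing the `Tcp:` counters in `/proc/net/snmp` (`OutSegs`, `RetransSegs`). That is only a guess about this environment; other hypervisors expose these counters differently. The helper names are hypothetical.

```python
def parse_tcp_counters(snmp_text: str) -> dict:
    """Return the Tcp: counters from the text of /proc/net/snmp as a dict."""
    rows = [line.split()[1:] for line in snmp_text.splitlines()
            if line.startswith("Tcp:")]
    header, values = rows[0], rows[1]  # first Tcp: line is names, second is values
    return dict(zip(header, (int(v) for v in values)))


def retrans_ratio(before: dict, after: dict) -> float:
    """Fraction of segments sent between two samples that were retransmissions."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return retrans / sent if sent > 0 else 0.0


# Usage idea: read /proc/net/snmp twice, some minutes apart, then:
#   ratio = retrans_ratio(parse_tcp_counters(first), parse_tcp_counters(second))
```

Sampling twice matters because the counters are cumulative since boot. A ratio over a known window can also be lined up against the discard counters on the DCI interfaces for the same window.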

We have some VM datastores that are replicated between DCs, and a write isn't acknowledged by the local server until the remote server has also ACKed it. So if we had lots of latency in the DCI, it would cause latency for the local VM too. The DCs are separated by 5 miles or so.
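For scale, the distance itself contributes very little. A back-of-the-envelope sketch follows. It assumes the fiber path is as long as the stated 5 miles (real paths are usually longer) and uses a typical refractive index of about 1.47 for the fiber. The function name is made up for illustration.

```python
C_KM_PER_S = 299_792.458   # speed of light in vacuum
KM_PER_MILE = 1.609344


def fiber_rtt_us(distance_miles: float, refractive_index: float = 1.47) -> float:
    """Round-trip propagation delay over fiber, in microseconds.

    Covers propagation only: no queuing, serialization, or device latency.
    """
    distance_km = distance_miles * KM_PER_MILE
    speed_km_per_s = C_KM_PER_S / refractive_index  # light is slower in glass
    one_way_s = distance_km / speed_km_per_s
    return 2 * one_way_s * 1e6


if __name__ == "__main__":
    print(f"RTT over 5 miles of fiber: {fiber_rtt_us(5):.0f} us")
```

Under these assumptions the round trip comes to roughly 80 microseconds. Any replication latency noticeable at the VM would therefore have to come from queuing, drops and retransmissions, or the storage systems themselves, not from the distance.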

This, too, is a case of having to prove it's not the network. But thank you in advance for any ideas :)


