Friday, May 15, 2020

[Question] What could cause multiple retransmissions of the same packet?

We have a customer who is complaining about slowness/timeouts when accessing their application. After doing the usual checks, the application did not seem to be at fault. The customer has multiple servers, all of them talking to each other and running multiple applications. These are virtual servers on ESX.

So we asked for tcpdumps of the network, and found a few things - zero-window packets, duplicates, keep-alives, etc. I believe those are normal in busy networks.
But there was something interesting - there were multiple retransmissions of the same packet, which continued until a timeout.

Now I do understand some retransmissions are normal, but these have a pattern:

  1. The retransmission is always for a SYN request
  2. It keeps retransmitting until the maximum RTO is reached - so there are usually 6 packets for one request.
  3. This happens for all applications, but appears to be only for HTTPS requests.
  4. The receiving server does not receive even one of the SYN requests. (We had asked them to capture packets simultaneously on multiple servers)
  5. There's no reset packet being sent for any of these
  6. The behaviour isn't seen for communication between the virtual servers on the same ESX host. It only appears when the packet is sent to servers on another host. And even for the servers on the other host, the behaviour is identical - they can communicate fine with each other, but have these retransmissions when communicating outside the host.
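
To make point 2 concrete, here is a minimal sketch of when each SYN would be sent if the sender starts with a 1-second timeout and doubles it after every unanswered attempt. The 1-second initial value and the doubling are assumptions on my part (they depend on the operating system and its settings); the count of 6 packets is what we saw in the dumps. The function name is made up for illustration.

```python
def syn_send_times(initial_rto=1.0, packets=6):
    """Return the offsets (in seconds) at which each SYN is sent.

    Assumes the retransmission timeout starts at `initial_rto` and
    doubles after every unanswered SYN. Both are assumptions, not
    values measured from the customer's servers.
    """
    times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(packets):
        times.append(elapsed)   # this SYN goes out now
        elapsed += rto          # wait one timeout for a reply
        rto *= 2                # exponential backoff
    return times


print(syn_send_times())  # [0.0, 1.0, 3.0, 7.0, 15.0, 31.0]
```

If the timestamps in a capture follow roughly this spacing, the sender is behaving like a normal TCP stack that simply never gets an answer.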

It appears that whenever a connection is initiated by a server, it is being throttled/blocked or being silently dropped. And this isn't true for all requests - just for some of them.
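
The way I compared the two captures can be sketched roughly as below. It assumes the packets from each dump have already been exported to simple tuples of (src, sport, dst, dport, flags), where flags is a short string such as "S", "SA" or "R". That record format and the function name are hypothetical, and the sketch also assumes there is no NAT between the hosts, so the addresses and ports match on both sides.

```python
from collections import Counter


def classify_syn_flows(sender_packets, receiver_packets):
    """Flag flows whose SYN was retransmitted and never answered.

    Each packet is a tuple (src, sport, dst, dport, flags); this
    format is an assumption for illustration, not a tool's output.
    """
    syn_counts = Counter()
    answered = set()
    for src, sport, dst, dport, flags in sender_packets:
        if flags == "S":
            syn_counts[(src, sport, dst, dport)] += 1
        elif flags in ("SA", "R", "RA"):
            # A reply travels in the reverse direction, so store it
            # under the key of the flow it answers.
            answered.add((dst, dport, src, sport))

    # SYNs that actually show up in the receiving server's capture.
    seen_by_receiver = {
        (src, sport, dst, dport)
        for src, sport, dst, dport, flags in receiver_packets
        if flags == "S"
    }

    report = {}
    for flow, count in syn_counts.items():
        if flow in answered or count < 2:
            continue  # answered, or never retransmitted
        if flow in seen_by_receiver:
            report[flow] = "arrived but unanswered"
        else:
            report[flow] = "never reached receiver"
    return report
```

In our case every flagged flow would fall into the "never reached receiver" group, which is why I suspect something between the two ESX hosts rather than the receiving server itself.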

I do not have much experience with troubleshooting networks, so wanted an opinion from the experts here. Is my conclusion correct, i.e., there is something throttling these requests? If so, have any of you faced this before, and would know what would cause it?

If I'm wrong, please let me know as well - I would like to learn more.

Apologies, I do not have details of the network architecture - all I know is that the servers are on ESX. But let me know if any more information is needed, and I may be able to help.


