Tuesday, April 13, 2021

Iperf Results - Host to host testing

I'm seeing some strange iperf results and am putting them out here in case someone has an explanation.

1) 2 iperf VMs on the same host:
TCP: ~20Gbps throughput for single/parallel (-P 10).
UDP: ~3Gbps throughput for single/parallel (-P 10), using 1460-byte packets and unlimited bandwidth.

2) 2 iperf VMs on different hosts (Host A, Host B):
TCP: ~7Gbps throughput for single/parallel (-P 10).
UDP: ~3Gbps throughput for single/parallel (-P 10).

3) 1 iperf VM on Host A connecting to 2 iperf VMs on Host B:
TCP: ~7Gbps throughput for single/parallel to each VM on Host B (aggregate throughput from the VM on Host A is ~14Gbps).
UDP: ~3Gbps throughput to each VM on Host B (aggregate throughput is ~6Gbps).

4) 1 iperf VM on Host A connecting to 3 iperf VMs on Host B:
TCP: ~14Gbps aggregate throughput, split between the connections to the 3 VMs (~7Gbps to one, 3-4Gbps to each of the other two).
UDP: ~9Gbps aggregate throughput (~3Gbps to each VM).
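
Since the UDP figure sits at roughly 3Gbps per VM in every test, it can help to look at it as a packet rate rather than a bit rate. The snippet below is only a back-of-the-envelope sketch: it assumes the 1460-byte length is the UDP payload size and that the reported rate counts payload bits only, and the function name is mine.

```python
# Rough packet-rate arithmetic for the UDP results above.
# Assumption: 1460 bytes is the UDP payload size and the reported
# throughput counts payload bits only (headers are ignored).

PAYLOAD_BYTES = 1460


def packets_per_second(gbps: float, payload_bytes: int = PAYLOAD_BYTES) -> float:
    """Convert a throughput in Gbps to datagrams per second."""
    return gbps * 1e9 / (payload_bytes * 8)


if __name__ == "__main__":
    for label, gbps in [("per VM", 3.0), ("test 4 aggregate", 9.0)]:
        print(f"{label}: ~{packets_per_second(gbps):,.0f} datagrams/s")
```

At ~3Gbps that works out to roughly a quarter of a million datagrams per second per receiving VM, which is why a per-packet processing limit is a plausible suspect for the UDP numbers.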

Setup:
- VMs are running on ESXi/Hyperflex hosts
- Each host has 2x25Gbps uplinks to TORs (UCS FIs).
- VMs are on same L2 Portgroup
- VDS port group load-balancing is set to balance based on virtual port ID
- No bandwidth Reservation/limiting in ESXi
- No other VMs are using the host uplinks during the testing. UCS has 200G uplinks which are <5% utilised during the testing. The host uplink port stats show the input/output rate on the link to Host A reaching 14Gbps in test 3, as expected from that result, but never going any higher in test 4.
- All VMs are running iperf3 version 3.9-1 on Ubuntu 20.04 LTS with 4 vCPU and 8GB RAM assigned.
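
For anyone wanting to reproduce the runs, here is a minimal sketch of how the client command lines could be assembled. The helper name, its parameters, the 10-second duration and the 192.0.2.10 server address are placeholders of mine, not the exact commands used for the results above; the flags themselves (-c, -P, -u, -b 0, -l, -R, -t, -J) are standard iperf3 options.

```python
from typing import List, Optional


def iperf3_client_cmd(server: str, parallel: int = 1, udp: bool = False,
                      length: Optional[int] = None, reverse: bool = False,
                      seconds: int = 10) -> List[str]:
    """Build an iperf3 client command line (hypothetical helper)."""
    cmd = ["iperf3", "-c", server, "-t", str(seconds), "-J"]  # -J: JSON output
    if parallel > 1:
        cmd += ["-P", str(parallel)]   # number of parallel streams
    if udp:
        cmd += ["-u", "-b", "0"]       # UDP with unlimited target bandwidth
    if length is not None:
        cmd += ["-l", str(length)]     # buffer/datagram length in bytes
    if reverse:
        cmd.append("-R")               # server sends, client receives
    return cmd


if __name__ == "__main__":
    server = "192.0.2.10"  # placeholder address for an iperf3 server VM
    print(" ".join(iperf3_client_cmd(server, parallel=10)))
    print(" ".join(iperf3_client_cmd(server, parallel=10, udp=True, length=1460)))
```

The resulting list can be handed to subprocess.run() as-is, assuming an iperf3 server (iperf3 -s) is already listening on the target VM.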

Issues:
- The difference between single and parallel TCP tests is negligible. With parallel streams, each stream's rate drops until the aggregate matches a single-stream TCP test.
- UDP testing between 2 VMs on the same host is still slow (~3Gbps, versus ~20Gbps for TCP). This looks likely to be down to the VM, iperf, or a combination of the two.
- There is a significant difference in aggregate throughput between tests 2 and 3 (~7Gbps versus ~14Gbps from the same VM on Host A). This would indicate that the CPU core iperf is running on is maxing out; however, CPU utilisation is low during the testing.
- I would expect to get at least 10Gbps of throughput on single/parallel TCP testing between two VMs on different hosts.
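
Two of the points above can be checked with a bit of arithmetic. The sketch below assumes the layout of iperf3's JSON output (-J), where each entry in end.streams carries a receiver.bits_per_second value for TCP or a udp.bits_per_second value for UDP; the SAMPLE report is made-up illustrative data, not one of my measurements, and both function names are mine. It is also worth keeping in mind that iperf3 releases before 3.16 service all streams from a single thread, so on a 4 vCPU VM one fully busy core would only show up as about 25% overall utilisation.

```python
from typing import Any, Dict, List


def per_stream_gbps(report: Dict[str, Any]) -> List[float]:
    """Return the per-stream throughput in Gbps from an iperf3 -J report."""
    rates = []
    for stream in report["end"]["streams"]:
        # Assumed layout: TCP streams carry "receiver", UDP streams carry "udp".
        stats = stream.get("receiver") or stream.get("udp")
        if stats is None:
            raise KeyError("stream has neither 'receiver' nor 'udp' stats")
        rates.append(stats["bits_per_second"] / 1e9)
    return rates


def overall_cpu_pct(busy_cores: int, vcpus: int) -> float:
    """Overall CPU utilisation if `busy_cores` of `vcpus` are fully busy."""
    return 100.0 * busy_cores / vcpus


# Made-up report for illustration only: two TCP streams sharing ~7Gbps.
SAMPLE = {
    "end": {
        "streams": [
            {"receiver": {"bits_per_second": 3.5e9}},
            {"receiver": {"bits_per_second": 3.5e9}},
        ]
    }
}

if __name__ == "__main__":
    rates = per_stream_gbps(SAMPLE)
    print("per stream (Gbps):", rates, "aggregate:", sum(rates))
    print("1 busy core of 4 vCPU:", overall_cpu_pct(1, 4), "% overall")
```

If the per-stream rates always sum to the single-stream figure, and per-core (rather than overall) CPU stats show one vCPU pegged, that would point at the sender or receiver process rather than the network.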

Additional testing:
- No difference in results with LRO turned off or on at the VM level
- Minimal difference in results using iperf2
- No difference in results when reversing the iperf traffic direction (-R)
- Checked for dropped packets etc. on the Server/FIs/Northbound switches
- Tested using different hosts in the HX cluster with similar results.


