Wednesday, September 26, 2018

Data Center Bridging - need some help...

Just to preface this: I've raised calls with both Microsoft and Dell and read the documentation extensively, but I don't seem to be getting very far. I'm hoping someone with real-world experience of this can offer some insight...

We have a Hyper-V cluster supported by a pair of Dell S4048T switches. Each host has 4 NICs; 2 of these are dedicated to iSCSI, so DCB is only required on the remaining two, which carry all cluster and server traffic. We're looking to use ETS to allocate bandwidth to certain traffic types over the non-iSCSI pair of NICs, specifically live migration and cluster, with a default for everything else. My config is below.

On each cluster node

Get-NetQOSTrafficClass

Name           Algorithm  Bandwidth(%)  Priority  PolicySet  IfIndex
----           ---------  ------------  --------  ---------  -------
[Default]      ETS        45            0-2,5-7   Global
LiveMigration  ETS        50            3         Global
Cluster        ETS        5             4         Global

Get-NetQOSPolicy

Name           : Cluster
Owner          : Group Policy (Machine)
NetworkProfile : All
Precedence     : 127
Template       : Cluster
JobObject      :
PriorityValue  : 4

Name           : Default
Owner          : Group Policy (Machine)
NetworkProfile : All
Precedence     : 127
Template       : Default
JobObject      :
PriorityValue  : 0

Name           : LiveMigration
Owner          : Group Policy (Machine)
NetworkProfile : All
Precedence     : 127
Template       : LiveMigration
JobObject      :
PriorityValue  : 3

Additionally, QoS is disabled on the iSCSI adapters and enabled on the remaining two.

Switch Config

service-class dynamic dot1p
dcb enable
dcb-map SET
 priority-group 1 bandwidth 50 pfc off
 priority-group 2 bandwidth 45 pfc off
 priority-group 3 bandwidth 5 pfc off
 priority-pgid 2 2 2 1 3 2 2 2

The dcb-map is then assigned to each switch interface that connects to the hosts' non-iSCSI NICs.
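As a sanity check on the mapping, the host traffic classes and the switch dcb-map can be compared per 802.1p priority. The following is a minimal Python sketch with the values transcribed by hand from the host and switch configuration above. The variable and function names are hypothetical and not part of any Dell or Microsoft tooling, and it assumes that position i on the priority-pgid line is the priority group for dot1p priority i.

```python
# Host side: bandwidth percentage per traffic class and the 802.1p
# priorities assigned to it (transcribed from Get-NetQosTrafficClass).
HOST_CLASSES = {
    "Default": (45, [0, 1, 2, 5, 6, 7]),
    "LiveMigration": (50, [3]),
    "Cluster": (5, [4]),
}

# Switch side: priority-group bandwidths and the priority-pgid line
# (transcribed from the dcb-map). Assumption: entry i of PRIORITY_PGID
# is the priority group for dot1p priority i.
SWITCH_GROUP_BANDWIDTH = {1: 50, 2: 45, 3: 5}
PRIORITY_PGID = [2, 2, 2, 1, 3, 2, 2, 2]


def host_bandwidth_by_priority(classes):
    """Return {priority: bandwidth %} as configured on the host."""
    result = {}
    for bandwidth, priorities in classes.values():
        for priority in priorities:
            result[priority] = bandwidth
    return result


def switch_bandwidth_by_priority(group_bandwidth, priority_pgid):
    """Return {priority: bandwidth %} as configured in the dcb-map."""
    return {p: group_bandwidth[g] for p, g in enumerate(priority_pgid)}


def find_mismatches(host, switch):
    """List priorities whose host and switch bandwidth shares differ."""
    return [p for p in range(8) if host.get(p) != switch.get(p)]


host = host_bandwidth_by_priority(HOST_CLASSES)
switch = switch_bandwidth_by_priority(SWITCH_GROUP_BANDWIDTH, PRIORITY_PGID)
mismatches = find_mismatches(host, switch)
print("mismatched priorities:", mismatches)
```

An empty list only means both sides agree on the bandwidth share per priority. It says nothing about whether traffic is actually being tagged with those priorities on the wire.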

Testing

The reason I don't think it's working is that, when testing, the live migration saturates the link. For example, if I live migrate 5 VMs and move a large (30GB) file simultaneously, the file transfer speed drops to a fraction of the link speed until the live migration has finished. If I amend the percentages to favour default traffic with a 95/5 split, the same behaviour occurs.
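To put a number on what I'd expect: as I understand it, an ETS share is a minimum guarantee that applies when the link is congested, so each class should in principle keep at least its percentage of the link. The following is a rough Python sketch of that arithmetic. The 10 Gb/s link speed is an assumption (the NIC speed isn't given here), and the function name is hypothetical.

```python
LINK_GBPS = 10.0  # assumption: actual NIC/link speed is not stated


def ets_floor_gbps(share_percent, link_gbps=LINK_GBPS):
    """Minimum bandwidth a class should keep under contention, in Gb/s."""
    return link_gbps * share_percent / 100.0


# Shares from the traffic class configuration on the hosts.
configured = [("Default", 45), ("LiveMigration", 50), ("Cluster", 5)]
for name, share in configured:
    print(f"{name}: at least {ets_floor_gbps(share):.1f} Gb/s under contention")

# The second test, with default traffic given a 95% share.
print(f"Default at 95%: at least {ets_floor_gbps(95):.1f} Gb/s under contention")
```

A file copy falling well below these floors during a live migration, in both tests, is what makes me think the shares aren't being applied at all.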

I feel like I'm misunderstanding something fundamental about how DCB works or how this should be configured. Can anyone offer any input?


