Hello all,
I'm in the process of redeploying an old HPC system that's composed mostly of M1000e chassis with 10Gb Ethernet and QDR InfiniBand. The M1000es have Mellanox M3601Q switch blades in them, and I'm wondering what the best way to hook them up is. Traditionally we run one IB cable per compute node, but since these actually behave like switches, cabling them fully would be like giving a single switch 16 uplinks, which is around 8 or 16 times as many as we usually run. Should I run only a few cables (2, 4, or 8) to the back of each of these, or should I run all 16? Previously only 8 were hooked up per chassis, but given the state the cluster was left in, I'm not sure that was a good idea.
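To make the trade-off concrete, here's a minimal sketch of the internal-to-external oversubscription for each uplink count, assuming the M3601Q's 32-port layout (16 internal QDR ports facing the blades, 16 external ports usable as uplinks) and roughly 32 Gb/s of usable data rate per 4x QDR link; the port counts and rate are assumptions, not measurements from this cluster:

```python
# Hypothetical sketch: blade-side vs uplink-side bandwidth for an
# M3601Q-style switch blade. Assumes 16 internal ports (one per
# half-height blade) and 4x QDR at ~32 Gb/s usable per link.

INTERNAL_PORTS = 16   # assumed: one QDR link per blade slot
QDR_GBPS = 32         # assumed usable data rate of one 4x QDR link

def oversubscription(uplinks):
    """Ratio of aggregate blade-side bandwidth to uplink bandwidth."""
    return INTERNAL_PORTS / uplinks

for uplinks in (2, 4, 8, 16):
    print(f"{uplinks:2d} uplinks -> {oversubscription(uplinks):g}:1 "
          f"oversubscription ({uplinks * QDR_GBPS} Gb/s uplink aggregate)")
```

So the previous 8-cable setup would have been 2:1 oversubscribed, which many MPI workloads tolerate; all 16 cables gets you to fully non-blocking at the chassis level.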
I'm also wondering whether I should hook them up directly to the core or still run "leaf" switches in each rack. My main goal is to make sure MPI works properly across 96+ nodes and that I/O from our file system holds up, since these nodes will mount users' home/work directories and our software stack over NFS.
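One way to compare the two layouts is by cable count. Here's a rough sketch assuming 6 chassis (96 nodes at 16 blades each), 2 chassis per rack, and 8 uplinks from each hypothetical rack leaf to the core; all of these numbers are illustrative assumptions, not details from the original post:

```python
# Rough cable-count comparison: chassis switches cabled straight to
# the core vs. through a per-rack leaf switch. All topology numbers
# (6 chassis, 2 per rack, 8 leaf-to-core uplinks) are assumptions.

CHASSIS = 96 // 16   # 6 chassis of 16 blades

def direct_to_core(uplinks_per_chassis):
    """Every chassis switch runs its uplinks straight to the core."""
    return CHASSIS * uplinks_per_chassis

def via_rack_leaves(uplinks_per_chassis, chassis_per_rack=2,
                    leaf_uplinks_to_core=8):
    """Chassis -> rack leaf -> core: adds a switch hop, but only the
    leaf-to-core runs need to leave the rack."""
    racks = -(-CHASSIS // chassis_per_rack)   # ceiling division
    chassis_to_leaf = CHASSIS * uplinks_per_chassis
    leaf_to_core = racks * leaf_uplinks_to_core
    return chassis_to_leaf + leaf_to_core

for u in (4, 8, 16):
    print(f"{u:2d} uplinks/chassis: {direct_to_core(u):3d} cables direct "
          f"to core vs {via_rack_leaves(u):3d} total with rack leaves")
```

The leaf design uses more cables overall but concentrates the long inter-rack runs into a handful of leaf-to-core links, at the cost of an extra hop (and another oversubscription point) on every MPI and NFS path.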