Friday, August 23, 2019

Trying to understand how InfiniBand works in a network environment

Hello all,

I am very interested in getting into high-performance computing at some point. Right now I'm just a helpdesk tech, but I think that with some learning I can make it happen someday.

I am taking one topic at a time, and the one I am having difficulty with is InfiniBand and how it fits into a network. As I understand it, it is a 100 Gb/s networking standard used to interconnect devices. What I don't understand is how it is used. Is it only used to interconnect devices within a cluster, for instance between nodes, so that throughput between the nodes is higher and reads/writes aren't bottlenecked by the network connection? This is going to sound embarrassingly novice, but if not, how would you run a 100 Gb/s connection from the demarc to the InfiniBand switch and have that work effectively?

For example, say I have a 100 Gb/s Mellanox InfiniBand switch with 18 ports, and all 18 ports are hooked up to 18 different nodes. I would then have to share 100 Gb/s across the 18 ports, right? That would reduce the throughput to roughly 5.6 Gb/s per node. It would also be limited by the outside connection to the building, so surely this must just be a protocol used for clusters and interconnected devices within the network, right?
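
To make my arithmetic concrete, here is a quick sketch of the calculation I have in mind. It bakes in my own assumption that all the ports share a single 100 Gb/s pool, which may well be wrong and is really what I'm asking about. The names (`LINK_GBPS`, `PORTS`, `naive_per_port_share`) are just made up for illustration.

```python
# ASSUMPTION (possibly wrong): every port on the switch draws from one
# shared 100 Gb/s pool, so the bandwidth is split evenly between nodes.
LINK_GBPS = 100  # rated speed I am assuming for the switch, in Gb/s
PORTS = 18       # number of ports, each connected to one node


def naive_per_port_share(total_gbps: float, ports: int) -> float:
    """Evenly divide a total bandwidth figure across a number of ports."""
    if ports <= 0:
        raise ValueError("ports must be a positive integer")
    return total_gbps / ports


share = naive_per_port_share(LINK_GBPS, PORTS)
print(f"{share:.2f} Gb/s per port")  # prints "5.56 Gb/s per port"
```

If the switch does not actually work this way, then the division above is the wrong model and I would like to understand why.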

Why not just use 100 Gb/s fiber? I was reading that InfiniBand has virtually no packet loss, but is that alone really worth it?

I would much appreciate any pointers on this topic. Thank you!


