From the course: NVIDIA Certified Associate AI Infrastructure and Operations (NCA-AIIO) Cert Prep

NVLink, NVSwitch, PCIe, RDMA vs. NCCL

These two groups of technologies look the same, but they differ in the way they operate: NVLink, NVSwitch, PCIe, and RDMA versus NCCL. NVLink, NVSwitch, PCIe, and RDMA are all made up of hardware and drivers, whereas NCCL is a software library, much like the libraries you would use in Python.

Think of NVLink, NVSwitch, PCIe, and RDMA as a high-speed expressway. These are the physical components that form the expressway, whereas traffic management is done by NCCL. What speed, what type of communication, which road to take, where to stop: all of that is taken care of by NCCL. So NCCL is your traffic management system. Hope this analogy brings home the point.

The focus of the physical and driver technologies is on how to transfer data fast, whereas NCCL focuses on how to organize many transfers efficiently. If the same data needs to be sent to 100 GPUs, it may not be very efficient to do the communication one-to-one. That is where NCCL would identify what is…
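To make the point about one-to-one sends concrete, here is a minimal toy model in Python. It is only a sketch and is not how NCCL is implemented: the function names are hypothetical, every transfer is assumed to take one equal "step", and the tree pattern is just one simple way of organizing transfers. It compares a source that sends to each GPU in turn against a scheme where every GPU that already holds the data forwards it to one more GPU per step.

```python
def naive_broadcast_steps(num_receivers: int) -> int:
    """Source sends to receivers one at a time: one step per receiver."""
    return num_receivers


def tree_broadcast_steps(num_receivers: int) -> int:
    """Each GPU that already has the data forwards it to one new GPU per step.

    The number of holders can at most double each step, so the step count
    grows roughly with log2 of the GPU count (toy assumption: all links
    are equal and can be used at the same time).
    """
    total_gpus = num_receivers + 1  # receivers plus the source
    holders = 1                     # only the source has the data at first
    steps = 0
    while holders < total_gpus:
        holders = min(total_gpus, holders * 2)
        steps += 1
    return steps


if __name__ == "__main__":
    receivers = 100
    print("one-to-one steps:", naive_broadcast_steps(receivers))
    print("tree steps:      ", tree_broadcast_steps(receivers))
```

Under these assumptions, sending to 100 GPUs takes 100 steps one-to-one but only 7 steps with the tree pattern, which illustrates why organizing the transfers matters as much as the raw speed of the link.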
