From the course: NVIDIA Certified Associate AI Infrastructure and Operations (NCA-AIIO) Cert Prep
Network fabric - NVIDIA Tutorial
Network fabric
So now you know the four types of network fabric. Let's compare them to understand each one better. We are going to compare the purpose of each fabric, how it is implemented physically or logically, and its key design features and considerations. Expect some questions on these topics in your exam.

So when it comes to the compute network, it is primarily designed for GPU-to-GPU communication, within a node or across nodes. It is the backbone for training and inference jobs. How is it implemented? It is implemented through InfiniBand, RoCE, or NVLink fabrics. We'll discuss all these details, so don't worry about them for now. The idea is that it is a high-bandwidth interconnect between compute nodes, because we want communication to happen as fast as possible. When it is implemented, it should have extremely high throughput and ultra-low latency. It must scale…
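To see why the compute network needs both high throughput and low latency, here is a minimal sketch in Python. It assumes the simple model that transfer time is roughly latency plus message size divided by bandwidth. The function name and the numbers (a 400 Gb/s link, 2 microseconds of latency) are illustrative assumptions, not figures from the course.

```python
def transfer_time_s(message_bytes, bandwidth_gbps, latency_us):
    """Estimate one-way transfer time as latency + size / bandwidth.

    Simplified model: ignores congestion, protocol overhead, and topology.
    """
    bandwidth_bytes_per_s = bandwidth_gbps * 1e9 / 8  # gigabits/s -> bytes/s
    return latency_us * 1e-6 + message_bytes / bandwidth_bytes_per_s


# Hypothetical link parameters, chosen only for illustration.
LINK_GBPS = 400
LINK_LATENCY_US = 2

small = transfer_time_s(4096, LINK_GBPS, LINK_LATENCY_US)           # 4 KiB message
large = transfer_time_s(1_000_000_000, LINK_GBPS, LINK_LATENCY_US)  # 1 GB message

# Fraction of the total time that is pure latency in each case.
small_latency_share = (LINK_LATENCY_US * 1e-6) / small
large_latency_share = (LINK_LATENCY_US * 1e-6) / large

print(f"4 KiB: {small * 1e6:.2f} us, latency share {small_latency_share:.0%}")
print(f"1 GB:  {large * 1e3:.2f} ms, latency share {large_latency_share:.4%}")
```

Under these assumed numbers, small messages are dominated by latency and large ones by bandwidth, which is one way to read the requirement that the compute fabric delivers both.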