From the course: NVIDIA Certified Associate AI Infrastructure and Operations (NCA-AIIO) Cert Prep
NVLink - NVIDIA Tutorial
NVLink
I am sure you know this building, and some of you may have visited it. These are the Petronas Towers in Kuala Lumpur. Why am I talking about this? Because of this sky bridge here. Consider a situation where the sky bridge does not exist. Say you are on one of the top floors of this tower and you want to go and meet somebody in the other tower. What do you have to do? You have to come all the way down, cross over at ground level, go back up the other tower, and then meet that person. But what if we have a direct communication channel like this sky bridge? The sky bridge lets you move from one tower to the other without going all the way down and coming back up again. That is the main advantage of the sky bridge: there is a physical connection between the two towers.
Why am I talking about this when the topic is GPUs? Consider that this is a GPU, GPU 1, and in the same system you have another GPU. If they have to communicate…
Contents
- NVIDIA: Powering AI GPU innovation (2m 37s)
- NVIDIA technology stack (3m 12s)
- Layer 1: Physical layer (3m 53s)
- GPU on a graphics card (1m 57s)
- DGX platform (2m 56s)
- DGX SuperPOD (1m 57s)
- ConnectX (1m 49s)
- BlueField DPUs (2m 32s)
- NVIDIA reference architectures (1m 38s)
- Understanding GPU cores (5m)
- Comparing GPU cores (4m 18s)
- NVIDIA DGX platform: Timeline (4m 47s)
- DGX platform: Deployment options (3m 38s)
- DGX A100 vs. H100 (4m 6s)
- Layer 2: Data movement and I/O acceleration (59s)
- NVLink (8m 5s)
- InfiniBand (2m 5s)
- InfiniBand vs. Ethernet (1m 43s)
- DMA and RDMA (6m 30s)
- GPUDirect RDMA (2m 44s)
- GPUDirect storage (1m 45s)
- Quick comparison (1m 56s)
- Layer 3: OS, driver, and virtualization (2m 17s)
- GPU drivers (4m 38s)
- GPU virtualization (5m 8s)
- vGPU vs. MIG, part 1 (7m 48s)
- vGPU vs. MIG, part 2 (10m 59s)
- Layer 4: Core libraries (6m 44s)
- Compute unified device architecture (CUDA) (3m 12s)
- Installing CUDA (2m 11s)
- NVIDIA collective communications library (NCCL) (3m 41s)
- NVLink, NVSwitch, PCIe, RDMA vs. NCCL (3m 44s)
- Layer 5: Monitoring and management (2m 23s)
- NVIDIA-SMI (4m 24s)
- Data Center GPU Manager (DCGM) (7m 27s)
- Base Command Manager (5m 33s)
- Which one to use? (2m 3s)
- Layer 6: Applications and vertical solutions (3m 48s)
- Summary (2m 26s)
- NVIDIA AI Enterprise (3m 2s)
- NVIDIA AI Factory (2m 24s)