From the course: AI Workshop: Advanced Chatbot Development
Principles of model pruning
- [Instructor] Welcome back. In this segment, we'll explore model pruning, a key technique for making large language models more efficient. Think of it like removing unnecessary weight from an F1 car to improve its speed and performance. Model pruning involves reducing the size of a neural network by removing less important parameters. This helps decrease the model's memory footprint and computational requirements, making it faster and more efficient without significantly impacting performance. It's like trimming down an F1 car to remove excess weight, improving its agility and speed. The benefits of model pruning include reduced memory usage, faster inference times, and lower power consumption. These advantages are crucial for deploying models in resource-constrained environments, like mobile devices or edge servers. Several studies have demonstrated the effectiveness of model pruning. For example, the paper "The Lottery Ticket Hypothesis" by Frankle and Carbin shows that pruned…
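To make "removing less important parameters" concrete, here is a minimal sketch of magnitude pruning, one common way to score importance: the weights with the smallest absolute values are set to zero. The transcript does not say which criterion the course uses, so treat this as an illustration under that assumption. The names `magnitude_prune` and `sparsity_of` and the sample weights are hypothetical, and a flat Python list stands in for a real layer's weight tensor.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed.

    `sparsity` is the fraction of weights to remove, between 0 and 1.
    Hypothetical helper for illustration, not taken from the course.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Rank positions from least to most important, using |w| as the score.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_prune = set(order[:n_prune])
    return [0.0 if i in to_prune else w for i, w in enumerate(weights)]


def sparsity_of(weights):
    """Fraction of entries that are exactly zero."""
    if not weights:
        return 0.0
    return sum(1 for w in weights if w == 0.0) / len(weights)


# Made-up weights standing in for one layer of a network.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.15, -0.9]
pruned = magnitude_prune(weights, 0.5)
print(pruned)
print(sparsity_of(pruned))
```

Zeroed weights only save memory and compute when the model is stored in a sparse format or run on kernels that skip zeros, so on its own this step marks which parameters could be dropped.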
Contents
- Principles of model pruning (5m 1s)
- Demo: Pruning the chatbot model (8m 19s)
- Theory and practice of model distillation (6m 58s)
- Demo: Applying model distillation to the chatbot (8m 38s)
- Understanding and implementing quantization (6m 34s)
- Demo: Quantizing the chatbot model (5m 35s)
- Demo: Overview of the results (10m 47s)
- Solution: Prepare the chatbot for deployment (11m 12s)