Optimizing your prototype for scalability

From the course: AI Orchestration: Developing and Testing Your AI Prototype


- [Instructor] It's time now to focus on making our sentiment analysis MVP model ready for real-world deployment by optimizing for performance and scalability. So, let's take a look at how we can do this. First, we need to know our current model's footprint and speed, which I'm measuring with my measure_model_performance function. It simply takes our saved model, runs inference through it, and calculates the inference time. And I have some code here that runs this function and measures the initial performance. If we look at the output below, we can see that our original model's inference time is 146 milliseconds and the size of the model is 255 MB. That's a good starting point. Next, we'll look at some optimization approaches, starting with the first strategy, which is TorchScript, or JIT compilation. This is fairly simple. What I'll be doing here is using a traced model, which reduces the model size and perhaps even the inference time. This may run into…
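The transcript does not show the instructor's code, so the following is only a minimal sketch of the two steps described above: measuring a baseline, then tracing the model with TorchScript. The body of measure_model_performance is an assumption (only the name comes from the course), and compare_eager_and_traced, the tiny placeholder network, and the file names are hypothetical stand-ins for the sentiment model.

```python
import os
import tempfile
import time


def measure_model_performance(predict, model_path, example, runs=20):
    """Return (mean inference time in ms, saved model size in MB).

    Assumed implementation; the course's own helper is not shown.
    `predict` is any callable that runs inference on `example`.
    """
    predict(example)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(runs):
        predict(example)
    mean_ms = (time.perf_counter() - start) * 1000 / runs
    size_mb = os.path.getsize(model_path) / (1024 * 1024)
    return mean_ms, size_mb


def compare_eager_and_traced():
    """Measure a placeholder model before and after TorchScript tracing."""
    import torch  # imported here so the helper above works without PyTorch

    # Hypothetical stand-in for the sentiment model.
    model = torch.nn.Sequential(
        torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 2)
    ).eval()
    example = torch.randn(1, 16)

    with tempfile.TemporaryDirectory() as workdir:
        eager_path = os.path.join(workdir, "model.pt")
        traced_path = os.path.join(workdir, "model_traced.pt")

        # Baseline: save the weights of the ordinary (eager) model.
        torch.save(model.state_dict(), eager_path)

        # TorchScript: record the operations run on the example input.
        traced = torch.jit.trace(model, example)
        traced.save(traced_path)

        with torch.no_grad():
            eager = measure_model_performance(model, eager_path, example)
            jit = measure_model_performance(traced, traced_path, example)
    return eager, jit
```

Calling compare_eager_and_traced() returns two (milliseconds, megabytes) pairs, one for the original model and one for the traced model. Whether tracing actually helps depends on the model and hardware, so the numbers are worth checking rather than assuming. Tracing also records only the operations executed for the example input, so data-dependent control flow in a model is not captured.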
