From the course: AI Orchestration: Developing and Testing Your AI Prototype
Optimizing your prototype for scalability
- [Instructor] It's time now to focus on making our sentiment analysis MVP model ready for real-world deployment by optimizing it for performance and scalability. So, let's take a look at how we can do this.

First, we need to know our current model's footprint and speed, which I'm measuring with my measure_model_performance function. It simply takes our saved model, runs inference through it, and calculates the inference time. I also have some code here that runs this function and records the initial performance. If we look at the output below, we can see that our original model's inference time is 146 milliseconds and the size of the model is 255 MB. Well, that's a good starting point.

Next, we'll be looking at some optimization approaches. So, let's start with the first strategy, which is TorchScript, or JIT compilation. This is fairly simple. What I'll be doing here is using a traced model, which can reduce the model size and perhaps even the model's inference time. This may run into…
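
The baseline measurement described earlier (inference time plus model size on disk) might look roughly like the sketch below. The course's own measure_model_performance function is not shown in the transcript, so the signature, the parameters (predict, sample, model_path, runs, warmup), and the returned keys are all assumptions made for illustration. The sketch uses only the Python standard library and works with any callable, not specifically a PyTorch model.

```python
import os
import tempfile
import time


def measure_model_performance(predict, sample, model_path, runs=20, warmup=3):
    """Return average inference time (ms) and model file size (MB).

    Hypothetical signature: the course's function of the same name
    is not shown in the transcript.
    """
    # Warm-up calls so one-time setup costs do not skew the timing.
    for _ in range(warmup):
        predict(sample)

    # Time several runs and average them to smooth out noise.
    start = time.perf_counter()
    for _ in range(runs):
        predict(sample)
    elapsed = time.perf_counter() - start

    return {
        "inference_ms": elapsed / runs * 1000.0,
        "size_mb": os.path.getsize(model_path) / (1024 * 1024),
    }


if __name__ == "__main__":
    # Stand-in "model": a placeholder file on disk plus a trivial predict function.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as f:
        f.write(b"\0" * (2 * 1024 * 1024))  # 2 MiB of placeholder weights
        path = f.name
    try:
        stats = measure_model_performance(
            lambda text: len(text.split()), "great movie", path
        )
        print(f"{stats['inference_ms']:.3f} ms, {stats['size_mb']:.1f} MB")
    finally:
        os.remove(path)
```

Running the same helper before and after each optimization step gives comparable numbers, as long as the sample input, the number of runs, and the hardware stay the same.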