From the course: AI Pricing and ROI: A Technical Breakdown
Introduction to AI as an API
- [Instructor] The magical experience of Gen AI via an API is that if you can dream up an idea, you can usually prototype it. This is a stark contrast with many more traditional ML tools, which require exploration, tuning, and adjustment. Let's go through some questions to ask when assessing an AI as an API. These factors include our existing ones: time to market, integration points, available data, team structure, domain areas, and compliance requirements. Now we're going to focus on the factors that are important for an AI API so we can merge these requirements together. This video will focus on time to market, reliability, cost, latency, multi-tenancy, and customizability.

The biggest advantage of AI as an API is time to market. The faster you can launch your product, the bigger your advantage. An early example is language translation, which was an easy way to integrate AI into your application. Since then, we have applications like traffic prediction, image classification, and many more. This allows us to validate our ideas quickly and figure out which type of AI we actually need.

Moving on to our second factor, which is reliability. In the early days of OpenAI, the ChatGPT API often failed. This made depending on the API a challenge. Since then, the OpenAI APIs have gotten a lot more reliable, and you can check their reliability on the status page at status.openai.com. What makes reliability complex is that the model not only has to offer uptime, but also has to give the correct answer and respond with the latencies that we expect. If the AI API is core to your business, it might be important to consider a multi-AI strategy, where you use multiple AI providers to make sure that your service keeps working even if one of them is down. In the ChatGPT case, OpenAI has a program called the Foundry program where you can pay extra to have additional capacity that's pre-reserved.

Now API costs can quickly add up, especially in the AI context. 
Gen AI is particularly expensive given the nature of the models. However, what's interesting is that the cost of Gen AI has been coming down, falling by a factor of over 600 from 2021 to March 2024, when this video was recorded. These data points are for GPT-3 or ChatGPT equivalents. These costs have been decreasing for a number of reasons. The cost of hardware has fallen, programs have gotten more optimized, and there's also a lot more investment money allowing companies to subsidize the cost of their APIs.

Now our next factor is latency. When you have your use case figured out, it's always possible to run smaller models, which might be cheaper than using an AI API. Smaller and custom models also benefit from lower latency. Some applications, like trading or recommendation systems, require low-millisecond latency, and these latencies are infeasible with general APIs. But for most use cases, especially in the Gen AI world, models like GPT-3.5 are fast enough.

And in a globalized and multi-user world, we also need to consider multi-tenancy. This is typically most important for B2B and international companies. APIs might be geo-limited or have limits in organizational structure, either for visibility or compliance. So it's very important to understand how your application functions and how you'd integrate with these APIs. As an example, some AI APIs have tenancy limits, meaning the number of dedicated applications that you provide to your users might be limited. As a personal example, at the company I work for, VoiceFlow, we had to manage a number of conversational AI applications for our customers. At the beginning, we used a vendor API, but as we started to scale, the vendor API could no longer keep up with our pace of growth. So eventually we deployed an in-house model and system to deal with this. 
So we used the vendor API to start and validate the problem, and scaled with our own systems when we ran into issues. In our case, the vendor API took us through the tens of thousands of applications, and we scaled ourselves through the hundreds of thousands.

Now this customizability might be really important to you, so it really depends on knowing your requirements. Sometimes you might need to configure the API in a specific way that is just impossible.

Now overall, we've gone over these six factors: time to market, reliability, cost, latency, multi-tenancy, and customizability. All of these are crucial for selecting the right AI API for your use case. AI APIs are a great first option for accelerating your first AI implementation. In the next video, we'll introduce you to AI as a platform and how it can be used for broader use cases.
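
To make the multi-provider idea from the reliability factor more concrete, here is a minimal Python sketch of a fallback wrapper. It is only an illustration under stated assumptions: the names `complete_with_fallback`, `flaky_primary`, and `stable_backup` are hypothetical, and the two provider functions are stand-ins for real vendor SDK calls, which the video does not specify.

```python
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every configured provider has failed."""


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) pair in order; return the first success.

    Returns a (provider_name, answer) tuple so the caller can log
    which provider actually served the request.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real vendor clients (assumptions for this demo).
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary provider timed out")


def stable_backup(prompt: str) -> str:
    return f"backup answer to: {prompt}"


name, answer = complete_with_fallback(
    "Translate 'hello' to French",
    [("primary", flaky_primary), ("backup", stable_backup)],
)
print(name, "->", answer)
```

Note that this only covers uptime. As mentioned above, reliability also includes answer quality and latency, so a production version would likely add timeouts and some check on the response before accepting it.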