From the course: AI Evaluations: Foundations and Practical Examples
Defining evaluation criteria from MVP to GA
- We just learned how you can actually check the quality of your agents: not just accuracy and citation quality, but the quality of responses, whether they are helpful, honest, and harmless, and how to come up with numbers. But numbers don't build products; products are built by people. And as you are learning, the new normal in AI agents is to launch them before they are perfect, because it is hard to build a perfect AI agent. That is why you need to iterate with real customers, with real contracts, to actually improve your AI agents. Does this mean you just launch your AI agent and then learn? Or is there a method where you can take minimum risk but add maximum value? That is what we are going to talk about in this module. So let's get started. Your first release is going to be a measurement launch. Here, you will launch to 1%, or even less than 1%, of your customers if you have a huge customer base. I'm talking thousands of users, not…
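The measurement launch described above is typically implemented with percentage-based rollout gating. As a minimal sketch (the function name, salt, and bucketing scheme are illustrative assumptions, not from the course), deterministic hashing of a user ID gives each user a stable bucket, so the same 1% of users keep seeing the new agent across sessions:

```python
import hashlib

def in_rollout(user_id: str, percent: float, salt: str = "measurement-launch") -> bool:
    """Deterministically decide whether a user falls inside a percentage rollout.

    Hashing (salt, user_id) gives a stable pseudo-random bucket in [0, 1),
    so the same user always gets the same answer for a given salt.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # first 4 bytes -> uniform [0, 1)
    return bucket < percent / 100.0

# Route roughly 1% of users to the new agent for the measurement launch;
# everyone else stays on the existing experience.
def route(user_id: str) -> str:
    return "new-agent" if in_rollout(user_id, 1.0) else "current-agent"
```

Changing the salt reshuffles the buckets, which lets you run independent rollouts for different launches without the same users always being first.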
Contents
- Decomposing AI agents into evaluative components (4m 6s)
- Identifying high-risk or hard-to-evaluate components (5m 10s)
- Manual evaluation with criteria (8m 14s)
- Defining evaluation criteria from MVP to GA (4m 58s)
- Hands-on lab: Vibe code auto evaluations using Cursor (8m 29s)
- Hands-on lab: Automating AI evaluation using LLM as judge (9m 27s)