From the course: AI Evaluations: Foundations and Practical Examples

Demo of fully functional human and auto-evaluator systems

- [Instructor] I know there are a lot of challenges in building AI agents for production. In this course, we'll build real applications as we go along, like this chatbot where you can upload a contract, ask questions, and get answers. We'll also show you how you can work with experts to build vertical agents, like the legal agent here, and use their knowledge to set up manual evaluation. That covers not just accuracy, but also how subjective qualities like helpfulness, harmlessness, and honesty can be articulated as criteria that you can judge and build your evaluations on. Further, we'll look beyond human experts at how you can scale these evaluations. As I'm showing here, we will use an LLM judge to produce these values rather than just calling on human agents to do our evaluations. And then we will figure out how you can put logging in place to check what the latency in your system is, how many tokens it consumes, when it fails, when it…
