From the course: OpenAI API and MCP Development

Tracing, monitoring, and evaluating with LangSmith

To conclude this module, I want to present a platform called LangSmith, which we use for observability and evaluation. It is designed specifically for applications built with large language models, and it is useful for understanding how these AI-powered systems work and behave as they grow in size, complexity, and functionality. LangSmith helps address these challenges by providing deep visibility into model interactions and outputs, prompt execution, and details like tool usage and agent decision making. That's going to be useful when you want to trace the details and outputs of applications built with multiple agents, like our last project, the multi-agent developer team, which has three agents. So it's interesting to understand how the agents interact and how the workflow behaves when it is powered by AI. You have different options to authenticate and log in. I'm going to use my GitHub account, which is very quick and straightforward, so let me click on this one. All right, let's see how we can get started. After logging in, you should land on the dashboard, and from there you can go to Tracing. You'll see that I already have a few projects up and running, and I want to create a new one, so I'm going to go to New Project. Here you go, you've got all the details to get started. I'm going to select OpenAI Agents SDK, and we're going to follow the steps: first create an API key, then install the dependencies, then configure the project with the correct environment variables. You need a LangSmith API key here, and also a project ID. You also have some starter projects to help you begin. All right, so let's start here. You'll see that this is already available in your existing project, so let's check the readme file.
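As a rough sketch, the configuration step comes down to setting a few environment variables before the application starts. The variable names below follow LangSmith's documented convention, but the key and project values are placeholders you would replace with your own (and the real key should never be committed or shared):

```python
import os

# Sketch of the LangSmith environment configuration; all values are placeholders.
os.environ["LANGSMITH_TRACING"] = "true"                   # turn tracing on
os.environ["LANGSMITH_API_KEY"] = "lsv2_pt_..."            # your private LangSmith key
os.environ["LANGSMITH_PROJECT"] = "multi-agent-dev-team"   # hypothetical project name
```

In the video this lives in a `.env` file that the project loads at startup; setting the variables in the process environment achieves the same effect.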
Actually here, I'm going to add it for you. So here you have the library for AutoGen with OpenAI. Let's go back and add this one to our project, following the instructions for LangSmith, and go through the steps again to configure the project. Here you have the .env.example file with what you need to add in your project, so let's go ahead and replace the placeholders with our API key and our project ID. You'll see that this is pretty straightforward. The project ID is this one, a combination of words and one two-digit number. Oh, my bad, the project ID actually goes here, on line five. Then, to generate the API key, you just need to click here, everything is in the same location. Very quick. Let's copy it and go back to our project. And remember that an API key must only ever be used locally and privately, on your own machine and for your own projects. If it is ever exposed publicly, you need to revoke it. Here we go, now we are almost all set; we have just configured our project. So let's go through the steps to start it. I already have my virtual environment up and running, so what I need to do is install the LangSmith integration for the OpenAI Agents SDK. Once that is done, we can start tracing. It's not yet configured in our existing project, though, so let's go back to the LangSmith documentation, where you can click to start tracing in an existing project. All right, let's begin by adding this to the project, and we're going to use traceable for every function that we want to trace. Let's check this out. Basically, we can trace every function that we want to run, and usually that function returns an output.
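A minimal sketch of the traceable pattern described above, with a no-op fallback so the snippet runs even without the `langsmith` package installed; the `summarize` function here is a hypothetical stand-in, not part of the course project:

```python
try:
    from langsmith import traceable  # real decorator when langsmith is installed
except ImportError:
    def traceable(fn):  # no-op fallback so the sketch runs without the dependency
        return fn

@traceable  # each call is recorded as a run, with its inputs and outputs
def summarize(text: str) -> str:
    # Hypothetical body; in the real project this would call a model or agent.
    return text.split(".")[0] + "."

print(summarize("Tracing shows each function's inputs and outputs. More detail follows."))
```

With tracing enabled, every call to a decorated function shows up in the LangSmith dashboard with its arguments and return value, which is why the function needs to return its result rather than only print it.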
All right, and for our project specifically, the way we have set things up, we have three agents and one main function, on line 101 (it was 100, but now it's 101), where we can add traceable. This main function does everything: it's where you start and run the workflow, running the team of agents on one project description until it reaches the final output. For now, we've only been printing the results to the console, so the function doesn't return much. What we can do is return the content for every step: the content when task one is completed, then task two, and then the final output. Otherwise, we're not going to see anything in the trace, because no outputs would be returned from this function. And I think we have enough interactions here, but I may allow more than 10 to make sure we capture enough interactions between the agents whenever we run the team. All right, let's try that. I think we are all set, so we can give it a try and see how it works. Basically, what we're trying to achieve here is to see how our application behaves by using tracing and observability. So let's run python main.py. All right, now we've got task one completed. I think we had a few outputs, so let's go back to the dashboard. You're going to see a message telling you that your agent workflow has been traced successfully. Let's close this panel and go to our project, the same one that we just created and registered, and look at the details of the traces. You can actually see task one. You see the input is not specified, because this function doesn't take any input, but we can see the details of task one with an API overview, which is the job that was assigned to the first agent.
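The change described above can be sketched like this, with hypothetical placeholders standing in for the agents' real results; the point is that the decorated main function returns the per-step content instead of only printing it, so each step becomes visible in the trace:

```python
try:
    from langsmith import traceable  # real decorator when langsmith is installed
except ImportError:
    def traceable(fn):  # no-op fallback so the sketch runs without the dependency
        return fn

@traceable  # the whole workflow appears as a single trace in LangSmith
def main() -> dict:
    # Hypothetical placeholders; in the real project these values come from
    # running the three-agent team on a project description.
    task_one = "Task 1 completed: API overview"
    task_two = "Task 2 completed: implementation"
    final_output = "Final output: reviewed result"
    print(task_one)  # the console output we already had
    # Returning the content is what makes each step's output inspectable.
    return {"task_one": task_one, "task_two": task_two, "final_output": final_output}

results = main()
```

The trace for this run would then show the returned dictionary as the output, while the input stays empty because the function takes no arguments, just as seen in the dashboard.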
And if you want to see more, which would be useful, you can allow more interactions. Remember that what we want is to see task two and also the final output, and we can achieve that by allowing more interactions, so maybe we could put 100 here. It depends on how the flow runs; it's really up to the agents powered by AI. This is why it can be quite challenging to work with AI: you never know exactly how to adjust, so you improve based on evaluation and comparison. That's why a tool like LangSmith is so useful for seeing how you can improve, instead of doing it manually. So I'm going to leave this here. The purpose was to show you how you can use a tracing tool like LangSmith for observability. By integrating with frameworks such as LangChain and, in this example, the OpenAI Agents SDK, you can bring it into custom language model pipelines and allow developers to trace execution, debug failures, and compare and evaluate model outputs, in order to identify errors or inconsistencies in your program and evaluate performance over time. Instead of treating language models as opaque black boxes, LangSmith turns these projects into inspectable, testable, and improvable components of a larger software system. Imagine that you have a workflow with many more agents: as your application grows in complexity and functionality, it is useful to have a way to observe and inspect how it works by tracing the details, allowing you to compare and evaluate the outputs and see how to adjust and improve the way it works. This makes LangSmith an essential tool for teams building reliable, scalable, and production-ready AI applications.