Before running your first experiment, upload your dataset directly to Braintrust. This unlocks cross-experiment comparison at the data-point level. You can click on one input and see how every experiment handled that exact case, side by side.
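To make the data-point-level comparison concrete, here is a minimal sketch in plain Python. It does not use the Braintrust SDK; the function name `side_by_side`, the row fields (`id`, `output`, `score`), and the sample data are all hypothetical and only illustrate the idea of pivoting results so that one shared input shows every experiment's answer next to each other.

```python
from collections import defaultdict


def side_by_side(experiments):
    """Pivot per-experiment results into a per-input view.

    `experiments` maps an experiment name to a list of rows, where each row
    is a dict with a shared dataset row `id`, an `output`, and a `score`.
    (This row shape is an assumption made for illustration.)
    """
    table = defaultdict(dict)
    for name, rows in experiments.items():
        for row in rows:
            table[row["id"]][name] = {
                "output": row["output"],
                "score": row["score"],
            }
    return dict(table)


# Hypothetical results from two experiments run over the same dataset rows.
experiments = {
    "baseline": [
        {"id": "q1", "output": "Paris", "score": 1.0},
        {"id": "q2", "output": "Lyon", "score": 0.0},
    ],
    "new-prompt": [
        {"id": "q1", "output": "Paris", "score": 1.0},
        {"id": "q2", "output": "Berlin", "score": 1.0},
    ],
}

comparison = side_by_side(experiments)

# Look at one input and see how each experiment handled that exact case.
for name, result in comparison["q2"].items():
    print(name, result["output"], result["score"])
```

The pivot only works because both experiments reference the same row ids, which is the practical reason to store the dataset once and reuse it across runs.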
About us
Braintrust is the AI observability platform helping teams measure, evaluate, and improve AI in production. By connecting evals and observability in one workflow, teams at Notion, Stripe, Zapier, Vercel, and Ramp ship quality AI products at scale.
- Website: https://braintrust.dev/
- Industry: Software Development
- Company size: 51-200 employees
- Headquarters: San Francisco
- Type: Privately Held
- Founded: 2023
Locations
- San Francisco, US (Primary)
Updates
-
That's a wrap on Trace. Thank you to the speakers who shared their stories and the builders who joined us. Now it's back to shipping quality AI. Couldn't make it? Catch the keynote on YouTube → https://lnkd.in/grvYcZJH
-
Braintrust reposted this
Braintrust is closing the loop from observability to evals to optimization, automatically surfacing patterns and acting on them. At Trace we announced a bunch of new products, including a Gateway, a CLI, updates to Brainstore and Loop, and Topics, which group traces into meaningful patterns so you can see what's actually happening. Thanks to all the customers who joined us today and to the team for putting on a great conference.
Observability tells you what happened. Evals tell you whether it's getting better. Braintrust connects the two by automatically surfacing patterns and acting on them. New from the Trace keynote:
- Topics finds insights and errors
- Loop enables human-powered remediation
- Braintrust Gateway standardizes access and observability across model providers
- Braintrust CLI brings the UI to the terminal
Read more → https://lnkd.in/g7dHEkdZ
-
Teams running AI applications in production generate thousands of traces a day, but can't read them all. Braintrust is launching Topics to automatically group traces into meaningful patterns, so you can see what's actually happening. Topics is accessible directly in the platform, so you're not relying on static reports. Build better datasets, generate inputs to your evals, and understand when new problems show up in production. Read more → https://lnkd.in/gaJTqixY
-
Braintrust reposted this
The Braintrust team has a rule: refuse to do repetitive work. So they built Custom Agents to automate it across every team. Now hours of manual work are done… before anyone logs on. "What's cool about agents in Notion is that they allow everyone in an org, whether you're technical or non-technical, to do the same kind of automation as a coding agent," says Ankur Goyal, CEO.
-
Trace is tomorrow. Can't make it? Catch the keynote later on YouTube → https://lnkd.in/g2-n_SfG
-
Braintrust reposted this
Evals are the bookends of great AI development. 📚 I’ve found that Braintrust is the ultimate partner for that journey. Agent observability is truly a team sport, and their platform bridges the gap perfectly between Devs, PMs, UX, and Ops. If you aren’t doubling down on evals, you’re just guessing.
To scale its voice agents across the globe, Navan built a continuous eval loop that observes production calls, understands the data, creates more evaluations, and then refines the system. The team now supervises hundreds of calls a day and has achieved a macro F1 score above 0.9. Read more → https://lnkd.in/g-VTpX6u
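For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The sketch below shows the calculation in plain Python; the labels and predictions are made up for illustration and say nothing about how Navan labels or scores its calls.

```python
def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 if the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical call outcomes: ground truth vs. what an evaluator predicted.
y_true = ["resolved", "resolved", "escalated", "escalated"]
y_pred = ["resolved", "escalated", "escalated", "escalated"]

score = macro_f1(y_true, y_pred)
print(round(score, 3))
```

Here the "escalated" class scores 0.8 and the "resolved" class scores about 0.667, so the macro average is about 0.733. A score above 0.9 therefore requires strong precision and recall on every class, not just the most frequent one.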
-
Braintrust reposted this
Ankur Goyal was building AI products long before ChatGPT. Now, as the Founder and CEO of Braintrust, which announced its $80M Series B last week, he's helping developers ship AI that actually works. Early Braintrust investor Saam Motamedi sat down with Ankur to talk about why evals are the foundation of AI product development, lessons learned when scaling GTM, and why Braintrust takes recruiting as seriously as customer obsession. Ankur also shares how Braintrust earned the trust of companies like Stripe, Instacart, and Airtable by making a deliberate bet on a small number of high-taste customers and going all in. Link to the full video in comments below.