Thor Olavsrud
Senior Writer

Salesforce unveils simulation environment for training AI agents

News
Nov 14, 2025
5 mins

eVerse uses synthetic data generation, stress testing, and reinforcement learning to train AI voice and text agents on simulated tasks prior to deployment.

Salesforce AI Research today unveiled a new simulation environment for training voice and text agents for the enterprise.

Dubbed eVerse, the environment leverages synthetic data generation, stress testing, and reinforcement learning to optimize agents. The company says eVerse is one more step in its journey to enterprise general intelligence (EGI), the creation of AI that is optimized for business applications and excels in capability and consistency.

“Even with the amazing progress that we’ve seen over the past years, AI systems are still prone to mistakes,” said Silvio Savarese, EVP and chief scientist of Salesforce Research. “For many enterprises, getting 90% or 95% accuracy is not acceptable. We need to have 99% accuracy. We need to get AI systems that are safe to use and that are trustworthy.”

One sticking point to fully leveraging autonomous AI agents involves what Salesforce calls “jaggedness” or “jagged intelligence,” in which AI systems that excel at complex tasks unexpectedly fail at simpler ones that humans reliably solve. Typically, Savarese said, jaggedness reveals itself in business use cases that require “common sense” reasoning.

Solving the jaggedness issue requires a new approach: learning from experience. AI must incorporate feedback from the environment and users into its learning process.

“It’s not about training bigger and bigger models,” Savarese said. “LLMs are currently trained using millions of billions of text data tokens, but this approach is leading us to a saturation point where improvement has been marginal.”

But a new approach based on learning from experience creates a new challenge: Letting agents learn on the fly, while deployed in production, is risky.

“We want to perform learning training before deployment by building simulation environments within which agents can be tested, evaluated, and improved until the desirable level of performance is achieved,” Savarese said.

Simulation with synthetic data

To that end, Salesforce’s eVerse simulates realistic enterprise environments populated with synthetic data or metadata that mimics real-world customer data. Teams run agents through tasks in the simulated environment, measure both failure modes and successful outcomes, and iterate until the agent is ready for production deployment.
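The simulate, measure, and iterate cycle described here can be illustrated with a toy sketch. This is not Salesforce's implementation or any eVerse API: `ToyAgent`, `evaluate`, `improve`, and `train_until_ready` are hypothetical names, the agent is reduced to a single success probability, and the "training" step is a placeholder. The 99% target echoes the accuracy bar Savarese cites.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent: `skill` is its per-task success probability."""
    skill: float

    def attempt(self, rng: random.Random) -> bool:
        # A simulated task succeeds with probability `skill`.
        return rng.random() < self.skill


def evaluate(agent: ToyAgent, n_tasks: int, rng: random.Random) -> float:
    """Run the agent on n_tasks simulated tasks and return the measured success rate."""
    successes = sum(agent.attempt(rng) for _ in range(n_tasks))
    return successes / n_tasks


def improve(agent: ToyAgent) -> ToyAgent:
    """Placeholder for a training step (assumption): halve the remaining error."""
    return ToyAgent(skill=agent.skill + 0.5 * (1.0 - agent.skill))


def train_until_ready(agent: ToyAgent, target: float = 0.99,
                      n_tasks: int = 2000, max_rounds: int = 20,
                      seed: int = 0) -> tuple[ToyAgent, list[float]]:
    """Evaluate in simulation and iterate until the target rate or the round limit."""
    rng = random.Random(seed)
    history: list[float] = []
    for _ in range(max_rounds):
        rate = evaluate(agent, n_tasks, rng)
        history.append(rate)
        if rate >= target:
            break  # measured performance meets the bar; stop before "deployment"
        agent = improve(agent)
    return agent, history


if __name__ == "__main__":
    final_agent, rates = train_until_ready(ToyAgent(skill=0.7))
    print(f"rounds: {len(rates)}, last measured success rate: {rates[-1]:.3f}")
```

The point of the sketch is the control flow: the agent is never exposed to real users while its measured success rate sits below the target, which is the risk Savarese describes in learning on the fly.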

Salesforce used eVerse to develop Agentforce Voice, which the company announced in October. Agentforce Voice helps organizations build voice-enabled agents that can handle complex conversations in real time. Prior to launch, Salesforce put Agentforce Voice through thousands of simulated conversations using eVerse. The simulations in eVerse’s virtual environments help teams train and stress-test voice agents with background noise, accents, crosstalk, and other complexities inherent in real-world operations.

“Think about bad cell phone connections, background noise, or heavy accents like my Italian accent,” Savarese said. “What we need are AI agents that can handle this in a way that is fluent, natural, and consistent. With eVerse, we can build these kinds of realistic voice interactions through simulations and use those simulations to evaluate how these agents behave.”

Healthcare pilot

Salesforce customer UCSF Health has also piloted eVerse. Clinical experts at the Californian hospital have been using eVerse to train and refine AI agents for healthcare billing. Sara Murray, MD, VP and chief health AI officer at UCSF Health, said eVerse has helped UCSF’s teams simplify one of the most complex parts of healthcare.

UCSF Health sees about 2.5 million outpatient visits per year.

“If you’ve received any healthcare in the last decade, you’ve probably known it’s very overwhelming at times,” Murray said. “We get about 9,000 billing inquiries monthly, where our human billing agents are spending thousands of hours annually just answering the phone.”

Patients’ questions are typically complex, and answering them requires highly specialized guidance from billing agents. The hospital created an agent capable of handling about 70% of questions based on the initial data provided, but it still had to refer patients to a human agent about 30% of the time.