
If AI Can Play Dungeons & Dragons, It Can Run Your ERP

A solo D&D experiment turned into a lesson in building agentic AI apps, connecting LLM prompt tuning, Model Context Protocol and Spring AI.
May 28th, 2025 6:00am

I’ve been playing a lot of Dungeons & Dragons again for the first time in 30 years. Not with humans, but with AIs, and I’ve mostly used ChatGPT, Claude and Gemini. Surprisingly, playing D&D with the AIs turned out to be the perfect way to understand what agentic AI actually is and how it might work in real-world enterprise software.

Enterprise systems are complex. So is D&D. You’ve got character creation, endless spell lists, NPC (nonplayer character) logic, encounter rules — it’s a tangle of workflows and state transitions that rivals, maybe even exceeds, most business processes. Navigating a dungeon with sentient fungus is arguably harder than validating a purchase order.

So here’s my theory: If the robot can play D&D, it can easily play enterprise software.

The Undifferentiated ChatDM

So, how good is AI at running D&D? Surprisingly good. I’ve been solo role-playing with ChatGPT and Claude for almost two years now. I think of the AI Dungeon Master as the ChatDM.

At first, it was a dancing bear: clumsy, but impressive just for trying. The ChatDM has gotten better since I started with ChatGPT 3.5. The ChatDM can now handle D&D’s complex, often tedious combat mechanics and even picks up on the kinds of stories, style and tone I like. Its grasp of D&D rules and lore is impressive, and it can reason through ambiguous edge cases, like whether a halfling can hide behind a statue in daylight.

But like most AI tools, it only works well if you put in the time. You have to learn how the model “thinks,” write careful prompts and keep iterating. I do a lot of coaching and out-of-game nudging to steer the ChatDM.

One early lesson: You have to spell out the obvious. If goblins are hiding in the bushes, the ChatDM should not say, “Unknown to Sven the Gnome, goblins lie in wait.” That ruins the game. You have to teach it basic genre sense.

When models update, things usually improve, but you have to retest all that coaching you built up.

The ChatDM also isn’t very creative. It defaults to clichés because that’s what it’s trained on. Ask it to start a game, and you’ll be in a tavern with a mysterious hooded figure sitting in a dim corner. Or walking through a forest, you’ll hear a tinkling noise and see flickering lights in the tree line. It gives you what’s likely and what’s been imagined already, not something original.

In enterprise terms: If you use the same base models as everyone else, you’ll get the same bland outputs. That might be fine for generic tasks like meeting notes or basic customer service analysis. But if you’re aiming for differentiation, you need proprietary data and you have to keep feeding it. Constantly.

I haven’t trained my own models — that’s above my pay grade. But D&D has taught me this: If you want unique results, bring unique context. Otherwise, it’s always the same hooded figure in the corner of your factory.

The App Still Matters … the Most

Over the past year, the chat apps themselves — ChatGPT and Claude — have massively improved the ChatDM experience. Early on, I had to copy and paste everything into a session: the adventure notes, character sheets, even the running game log. That was fine until the context window filled up and things started to break. Like when the cobbler, who definitely promised to sew hidden pockets into my boots, suddenly forgot he’d ever met me.

There’s still some bookkeeping involved, but the newer app features help a lot. With projects, I can store world settings, characters and adventure files that stay connected to the chat. And the memory features are finally getting good at tracking long-running context.

The enterprise lesson? The model matters, sure, but the app around the model matters more. That layer is cheaper and faster to build than a model, and it’s where your real differentiation happens. Anyone can use the same LLM, but your surrounding logic, data and UX are where you get unique value.

The Valuable Work Starts Above the Model

So how do you actually get AI working at the application layer?

The obvious move is embedding AI into your apps. Toolkits like Spring AI show clear patterns for doing just that. And if you’re like most enterprise developers, you’re probably using Java, and probably Spring. In that case, adding AI is a lot like adding a database or any other service. As long as your platform provides AI capabilities, you’re good to go.

But “agentic AI” is a different beast. I’m still not sure what that term’s supposed to mean or what it will come to mean. The basic idea is to give the AI tools and rely on its reasoning capabilities to choose a string of actions that accomplishes some task. That’s a very different approach than just calling an API. The problem with these so-called “new paradigms,” to use the term precisely and with irony, is that nobody shows you how to build them. The glossy marketing is all about outcomes, not implementation.

That’s where Model Context Protocol (MCP) comes in. Writing a bit of code to extend the ChatDM with MCP gave me a hands-on way to explore how agentic AI actually works. Not the idealized, boardroom version of AI, but the messy, practical reality. In other words: a test to see if this is the real deal or just well-produced mystery meat.

An Agentic AI Spellbook: Model Context Protocol

I’ve spent a good chunk of time writing MCP code for the ChatDM, and what I’ve gotten is a better sense of how we get from the vision of agentic AI to a practical how.

First off: As the name suggests, MCP is a protocol, not code. It defines a set of concepts and an architectural structure: a way for AIs to call out to tools, fetch data and automate prompting. It’s a spec, not an implementation. You still have to build the actual code that makes it all go.
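Concretely, that protocol is JSON-RPC 2.0 under the hood. A tool invocation from client to server looks roughly like this (the tool name and arguments here are illustrative, not from a real server):

```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "roll_dice",
    "arguments": { "notation": "3d6+5" }
  }
}
```

The server answers with a result payload, and the AI folds that back into its reasoning. Everything else in MCP (listing tools, fetching resources, getting prompts) follows the same request/response shape.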

I don’t know how it happened, but MCP has exploded in popularity in an incredibly short time. There are now hundreds of implementations (called “servers”) and growing community momentum. That, more than anything, is what makes it exciting right now: People are actually building with it. And, yes, it feels like a well-thought-through spec, but only sustained success over time will prove that.

There are plenty of deep dives on MCP out there, but I’ll give you the quick tour.

Understanding the Agentic Stack

Model Context Protocol is built around “concepts.” These cover both the actors in the system and the functionality they expose. If you’re looking at the Spring AI MCP diagram above, you’ve already seen most of them.

Clients and Servers

The two core actors are the MCP client and MCP server. These names are a little confusing at first. The client is the AI itself — say, the Claude desktop app, or a headless agent running behind the scenes. The server provides functionality to the AI. I think of it as a plugin more than a “server” — not technically precise, but it helps my brain.
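To make the pairing concrete: When the Claude desktop app is the client, you attach a local MCP server by registering it in the app’s claude_desktop_config.json. The server name and jar path below are made up for illustration:

```json
{
  "mcpServers": {
    "chatdm": {
      "command": "java",
      "args": ["-jar", "chatdm-mcp-server.jar"]
    }
  }
}
```

Once registered, Claude launches the server process itself and discovers whatever tools, resources and prompts it exposes.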

Tools

The most-used concept is a tool: a callable function like “roll a die,” “check the weather” or “look up the price of eggs in Harris County.” Tools are where most of the “do something” power lives.

Resources

Then you’ve got resources — static content like files, databases or structured text. These are straightforward. But Claude’s implementation is still clunky, which takes some shine off the idea of querying a database of dragon names midsession.

Prompts

Prompts are the most interesting and the most overlooked. On the surface, a prompt is just a saved input you can send to the AI. Why not just type it yourself? Because prompts can dynamically reference tools and resources, letting you customize behavior in real time.

Think of it like this: You hand a chef a vague recipe (“make me a sandwich”), then toss over some ingredients from your fridge (resources) and some kitchen gear (tools). The chef reasons out what to do. If you’re lucky, you get a Reuben. That’s agentic AI.

Unearthed Arcana

There are other concepts like sampling, where the server can call back into the client to run a prompt, but support is thin. I haven’t played with them yet, but they sound fun and slightly dangerous.

Next, I’ll walk through how I used MCP to upgrade my ChatDM with tools, what I learned and some patterns that might generalize to your own agentic AI projects.

Coding With MCP Java SDK and Spring AI

Like I said, MCP is just a specification, not code. So what does the actual implementation look like?

Well, it looks like code. But not complicated code. Most MCP examples are small tasks: Check the weather, read a file, maybe retrieve some code. That’s how a protocol works: You wire up handlers and let the AI call them. That makes a lot of MCP server examples look underwhelming at first glance.

I started with the official Java SDK, the descriptively named MCP Java SDK. It grew out of work by the Spring AI team soon after MCP was released and is now integrated with Spring AI. With that in place, I could create and spin up a standard Spring Boot app and start wiring in AI tools very quickly.

And with that toolkit in place, I could finally play some D&D.

Rolling Dice

Well, OK: not exactly playing by rolling dice, but getting close by coding dice.

The first tool I built was a dice roller. These days, the ChatDM is getting better at simulating real dice, often by generating and running a quick Python script behind the scenes. But early on, I’m pretty sure the models were just hallucinating rolls. Creating a dice roller that supports all of the simple, and then wacky, ways of rolling dice in D&D makes a good first MCP tool.

A simple roll might be d6 — roll a six-sided die and return the result. Slightly more complex is something like 3d6 + 5. You can probably figure that one out. But things get messy fast. In modern D&D, you often roll with advantage or disadvantage — you roll twice and take the higher or lower number. If you’re making a combat roll with a d20, a natural 1 is a critical failure (something really bad happens), and a 20 is a critical success (you double the value of the dice you roll for damage, but not the modifiers like +7). And that’s just D&D. Other systems use custom symbols — runes even! — instead of numbers. Some people “roll” with poker cards. “Dice” gets weird quick in role-playing games.

Adding a dedicated tool ensures that rolls happen consistently and correctly — no hallucinated math. You could get the AI to generate code on the fly, but it’s far easier to use a library like Dice Notation Tools for Java that handles both common and esoteric rolls.

Wiring it into an MCP tool with Spring AI is simple:
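The original listing isn’t reproduced here, so below is a minimal, self-contained sketch of what such a tool can look like. The regex parser stands in for the Dice Notation Tools library, and all names are illustrative; in the Spring AI version, the public method would carry a `@Tool(description = "...")` annotation so the MCP client can discover it.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.logging.Logger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hand-rolled stand-in for a dice-notation library. In the real server this
// roll() method would be annotated with Spring AI's @Tool(description = "...")
// so the MCP client can discover and call it.
class DiceRollerTool {

    private static final Logger log = Logger.getLogger("DiceRollerTool");

    // Handles simple lowercase notation like "d20", "3d6" or "3d6+5".
    private static final Pattern NOTATION =
            Pattern.compile("(\\d*)d(\\d+)\\s*(?:([+-])\\s*(\\d+))?");

    public static int roll(String notation) {
        Matcher m = NOTATION.matcher(notation.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported dice notation: " + notation);
        }
        int count = m.group(1).isEmpty() ? 1 : Integer.parseInt(m.group(1));
        int sides = Integer.parseInt(m.group(2));
        int modifier = (m.group(3) == null) ? 0
                : Integer.parseInt(m.group(4)) * ("-".equals(m.group(3)) ? -1 : 1);

        // Sum one random value per die, then apply the flat modifier.
        int total = modifier;
        for (int i = 0; i < count; i++) {
            total += ThreadLocalRandom.current().nextInt(1, sides + 1);
        }
        log.fine(notation + " -> " + total);
        return total;
    }
}
```
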

If you ignore the imports and logging, the code isn’t doing much. That’s the point. With Spring AI, you can stand up an MCP server with minimal effort — no config boilerplate, no wiring madness. (I left out the “one line of code” you have to add to tell Spring to load the tool, but it’s kind of boring to go over: Feel free to witness its magnificence if it so pleases you.)

Once the tool is running, we can finally use official D&D mechanics to figure out if that staggering figure has some bizarre fungal infestation or is just drunk at 9:14 a.m.:

If we were only doing dice rolling, agentic AI wouldn’t be too interesting. Fun to code, sure, but not world-shaking.

In solo D&D, there’s something called an oracle. You use it when you want to answer a question randomly, usually to keep yourself honest and add some unpredictability. Asking, “Is there a goblin hiding in the bushes?” and going with the roll is a lot more fun than deciding yourself.

Implementing an oracle as an MCP tool is simple. Codewise, it just picks from a list of answers: yes, no, maybe, etc. But as with the dice roller, the real work isn’t in the function, it’s in the natural language that explains the tool to Claude.

This is another classic “just one line of code” situation, except the important part is the description. That’s what tells the AI what the tool does, when to use it and why it might matter in context.
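To make that concrete, here’s a hedged sketch of the oracle. The answer list and the description wording are illustrative; in the Spring AI version, the description string would live in the `@Tool` annotation, and it is the part doing the real work:

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Sketch of an oracle tool. The DESCRIPTION text is what would go into Spring
// AI's @Tool(description = ...) annotation -- it tells the MCP client (Claude)
// what the tool is for and when to call it. The code itself is trivial.
class OracleTool {

    // Hypothetical description; this natural language is what the AI reasons over.
    static final String DESCRIPTION =
            "Answers a yes/no question with a random oracle result. Use this whenever "
          + "the player asks an open question about the world (e.g. 'Is there a goblin "
          + "hiding in the bushes?') so the answer stays unpredictable and fair.";

    private static final List<String> ANSWERS = List.of(
            "Yes", "Yes, but...", "No", "No, and...", "Maybe");

    public static String consult() {
        // Just pick a random line from the answer list.
        return ANSWERS.get(ThreadLocalRandom.current().nextInt(ANSWERS.size()));
    }
}
```
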

In both of these examples, you get to the first real insight from writing MCP tools: The natural language interface matters more than the code.

In these cases, the code just calls a dice-rolling library or selects a random line from a file. Granted, that dice-rolling library is doing a ton of heavy lifting — it’s the classic “one line of code” that hides 5,000 lines of actual functionality. Reading files must be at least double that!

But the description annotations are the interesting part. Those are what Claude (the MCP client) uses to understand what the tools do, how to use them and, presumably, when to use them. So you’re not just writing documentation in the traditional sense. You’re teaching the AI not only what your tools are and how to use them, but why they exist, so it can reason out when and why to use them.

MCP Tools Are Just the Start

Tools are the easiest MCP concept to build and the most satisfying. They’re also the best-supported in Claude and other clients. The other two concepts, resources and prompts, don’t see as much use yet. That said, you can mimic both with tools: A tool can load files, query databases or even return a prompt and tell the client to run it.

That kind of overloading seems fine, at least based on the vibes I get from the spec. But it’d be a lot cooler if more MCP servers supported resources and prompts properly, as intended. I’ll get into that next time. There’s a lot more to explore.

In the meantime, you can see me walking through creating MCP tools, resources and prompts in my (at the moment) four-part video series. Spring AI makes it super easy and quick. And doing all this has given me a better grounding in the whole “AI is going to destroy our creativity and livelihoods” panic, so I’ll cover that too.

For now, I’m off to check if there’s a goblin hiding behind that boulder, and whether it has a poison-tipped spear in its hand or a fresh-baked, warty apple pie.
