André Lindenberg’s Post

1mo

MindStudio measured the MCP tax many developers carry blind. Every connected server injects its full tool schema into every message … not at session start, every turn. Four servers burn 15,000–20,000 tokens per turn … 9% of a 200k context window before the first real prompt. Estimation: tokens ≈ (tools × 200) + (description chars ÷ 4). Project-scoped configs and pruning mitigate it … something like lazy tool loading at the protocol level could fix it. #MCP #contextEngineering #tokenOptimization #claudeCode
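The estimate in the post can be turned into a quick back-of-the-envelope check. The sketch below is a minimal illustration, assuming the heuristic exactly as quoted (200 tokens per tool plus one token per four description characters). The function names and the sample server setup are hypothetical and are not taken from MindStudio's measurements.

```python
def estimate_mcp_tokens(tool_count: int, description_chars: int) -> int:
    """Rough schema overhead using the heuristic quoted in the post:
    tokens ≈ (tools × 200) + (description chars ÷ 4)."""
    return tool_count * 200 + description_chars // 4


def context_share(tokens: int, context_window: int = 200_000) -> float:
    """Fraction of the context window occupied by `tokens`."""
    return tokens / context_window


if __name__ == "__main__":
    # Hypothetical setup: 4 servers with 15 tools each (60 tools)
    # and 24,000 characters of tool descriptions in total.
    overhead = estimate_mcp_tokens(tool_count=60, description_chars=24_000)
    print(overhead)                          # 18000
    print(f"{context_share(overhead):.1%}")  # 9.0%
```

With these assumed inputs the result lands inside the 15,000–20,000 range and at the 9% share of a 200k window mentioned above; real numbers depend on the tokenizer and on how each server words its schemas.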

21 Comments

André Lindenberg 1mo

Claude Code MCP Servers and Token Overhead: What You Need to Know … https://www.mindstudio.ai/blog/claude-code-mcp-server-token-overhead

5 Reactions

Anuj Sadani 1mo

Feels like we need “tool attention” mechanisms: only surface schemas when the model is likely to use them.

3 Reactions

Bill Easton 1mo

I think saying that it "burns 15,000–20,000 tokens per turn" sends the wrong message. It adds 20k tokens to the start of the conversation, and the way LLMs work means you're sending the whole conversation back and forth on every turn anyway.
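To make the distinction concrete: a fixed schema block sits at the start of the context and is part of every request simply because the whole conversation is resent each turn. The sketch below is a rough illustration under stated assumptions: a 20,000-token prefix, a hypothetical 500 tokens added to the conversation per turn, and no prompt caching. None of these figures are measurements.

```python
def input_tokens_per_turn(prefix: int, per_turn: int, turns: int) -> list[int]:
    """Input size of each request when the whole conversation is resent.

    `prefix` is the fixed tool-schema block at the start of the context;
    `per_turn` is an assumed number of tokens that each turn adds.
    """
    return [prefix + per_turn * turn for turn in range(1, turns + 1)]


def prefix_share(prefix: int, per_turn: int, turns: int) -> list[float]:
    """Share of each request's input that is the fixed prefix."""
    sizes = input_tokens_per_turn(prefix, per_turn, turns)
    return [prefix / size for size in sizes]


if __name__ == "__main__":
    sizes = input_tokens_per_turn(prefix=20_000, per_turn=500, turns=3)
    print(sizes)  # [20500, 21000, 21500]

    shares = prefix_share(prefix=20_000, per_turn=500, turns=40)
    print(f"{shares[0]:.0%} -> {shares[-1]:.0%}")  # 98% -> 50%
```

Under these assumptions the prefix occupies the same 20,000 tokens of the window throughout; what changes from turn to turn is how much of each request it accounts for.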

1 Reaction

Gavin Smith 1mo

We took that into account when we built this - check this out - www.actionpilot.co.uk ActionPilot sits between your models and any MCP-connected system — policy, simulation, isolation and audit, before a single row is touched. Bring your own LLM. Bring any MCP server, including Zoho MCP.

J.D. Salbego 1mo

Fair point. Context bloat can act like a one-time upfront fee rather than a per-turn surcharge, so reframing the problem helps teams focus their token-trimming efforts where it counts most. Maybe test your next config change with a quick before-and-after context window check to see what’s really moving the needle.

Prashant Patel 1mo

There is a project that addresses this issue: https://portofcontext.com/

1 Reaction

Fernando Pastor Alonso 1mo

Always loading every MCP server is the lazy approach, and it breaks the model's awareness. On the other hand, if the schema is not there, the model does not know the tool exists and can't choose to call it. What I found works best is loading a lightweight index, which could be tool names plus a one-line description of when each should be called (50 tokens per tool, for example), and then only pulling full schemas when a tool is invoked.
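The index-then-schema idea described here can be sketched in a few lines. This is a minimal illustration, not part of the MCP protocol or any client library: `Tool`, `LazyToolRegistry`, and the sample tools are hypothetical names, and a real client would still need a way to feed the fetched schema back to the model.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    when_to_use: str  # one-line hint shown in the index
    schema: dict      # full JSON schema, handed out only on demand


@dataclass
class LazyToolRegistry:
    tools: dict[str, Tool] = field(default_factory=dict)
    loaded: set[str] = field(default_factory=set)

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def index(self) -> str:
        """Lightweight listing that is placed in the context up front."""
        return "\n".join(f"{t.name}: {t.when_to_use}" for t in self.tools.values())

    def schema_for(self, name: str) -> dict:
        """Return the full schema only once the model picks the tool."""
        tool = self.tools[name]  # raises KeyError for unknown tools
        self.loaded.add(name)
        return tool.schema


if __name__ == "__main__":
    registry = LazyToolRegistry()
    registry.register(Tool(
        name="search_issues",
        when_to_use="Find tickets by keyword or label.",
        schema={"type": "object", "properties": {"query": {"type": "string"}}},
    ))
    registry.register(Tool(
        name="create_issue",
        when_to_use="Open a new ticket.",
        schema={"type": "object", "properties": {"title": {"type": "string"}}},
    ))

    print(registry.index())                       # two short lines, no schemas
    print(registry.schema_for("search_issues"))   # full schema, fetched on demand
    print(registry.loaded)                        # {'search_issues'}
```

The model always sees that both tools exist, but only the schema of the tool it actually selects is ever added to the context.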

1 Reaction

Shawn Kahalewai Reilly 1mo

I have created a Standard to address the token tax problem and provide enhanced security for MCP Planning. The Standard is called the Agent Capability Standard, released Apache 2.0 for anyone to use. https://github.com/kahalewai/acs ACS solves the token tax problem by defining a two-layer capability exposure architecture: a token-minimal Capability Summary Index that provides planning-relevant awareness, and on-demand Capability Manifests that deliver full planning contracts without invocation-layer metadata. At higher conformance levels, ACS defines a Planning Authorization Enforcement Point (PAEP) that governs capability visibility prior to planner reasoning, eliminating the planning-layer reconnaissance surface created by pre-authorization schema injection. Please feel free to use this Standard in your MCP implementations, or reach out if you have any questions. Hope this helps someone!

1 Reaction

Vladimir O. 1mo

You can try calling the API directly from the Claude Code input and then compare that with the MCP connection to form your own opinion.

👓🌍 Ian (GeoAR it®) 1mo

MCP skills are being worked on to reduce context bloat.

1 Reaction


More Relevant Posts

Fernando Pastor Alonso
1mo Edited
Loading every MCP server upfront is the lazy approach and breaks the model’s awareness of what’s actually relevant. But the opposite fails too. If the schema isn’t there the model doesn’t know the tool exists and can’t choose to call it. What I found works best is loading a lightweight index, tool names plus a one line description of when each should be called (around 50 tokens per tool). Full schemas only get pulled when the tool is actually invoked. Same principle applies to memory. Presence in context should be earned by relevance, not loaded “just in case.”
Ahmed Orko Nur
1mo
Integrating Claude and NotebookLM is a true game changer. After going through a bunch of tutorials on how to do the integration through an MCP server - this one worked for me without any glitches: https://lnkd.in/eMz4FSiX

NotebookLM Claude Integration | MCP Servers · LobeHub lobehub.com
Jonathan Cavell
1mo
Anyone else who grew up writing applications in a good old fashioned client/server model having a hard time fully wrapping their heads around "MCP server" as a concept? I keep having to remind myself that an MCP server isn't a server at all... the code doesn't sit somewhere and get "served"... I have to run it with my agents on their infrastructure. Maybe I'm alone on this, but it's giving me a very weird mental block.

10 Comments
Joe Rinehart
3w
Wow. I just slashed my Claude usage in a specific environment where an MCP server had been generated off of Protobuf services. The culprit? Verbose descriptions on every field and RPC. 22k token cost.
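One way to check a generated server for this kind of bloat is to total the description text in its tool schemas and see what trimming would save. The sketch below is a minimal illustration, assuming the schemas are JSON-Schema-like dictionaries with `description` keys and reusing the chars ÷ 4 heuristic from the original post. The helper names and the sample schema are hypothetical, and this is not necessarily how the saving described here was achieved.

```python
def description_chars(node) -> int:
    """Total length of every "description" string in a nested schema."""
    if isinstance(node, dict):
        total = 0
        for key, value in node.items():
            if key == "description" and isinstance(value, str):
                total += len(value)
            else:
                total += description_chars(value)
        return total
    if isinstance(node, list):
        return sum(description_chars(item) for item in node)
    return 0


def trim_descriptions(node, max_chars: int = 80):
    """Return a copy of the schema with each description cut to `max_chars`."""
    if isinstance(node, dict):
        return {
            key: value[:max_chars]
            if key == "description" and isinstance(value, str)
            else trim_descriptions(value, max_chars)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [trim_descriptions(item, max_chars) for item in node]
    return node


if __name__ == "__main__":
    # Hypothetical tool schema with verbose generated descriptions.
    schema = {
        "name": "GetOrder",
        "description": "x" * 400,
        "inputSchema": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "y" * 240},
            },
        },
    }
    before = description_chars(schema)
    after = description_chars(trim_descriptions(schema, max_chars=80))
    print(before, after)          # 640 160
    print((before - after) // 4)  # 120 tokens saved, by the chars ÷ 4 heuristic
```

Blind truncation is only a measuring aid; in practice the descriptions would be rewritten by hand so the model still gets a useful hint about when to call each tool.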
André Lindenberg
1mo
Install codeburn, then run codeburn optimize. That command scans your coding agent sessions for specific waste ... repeated file reads, low read-to-edit ratios, uncapped bash output, unused MCP servers ... and returns token savings estimates with copy-paste fixes. Grades your workflow A-F, reads JSONL and SQLite from disk, covers seven providers. The monitoring TUI ships with it, but the optimizer is why it sticks. #codeburn #tokenOptimization #agenticCoding #openSource
36 Comments
Girish Manchaiah
1mo
#codeburn #tokenOptimization #agenticCoding #openSource Note: Also codeburn optimize scans your sessions and your ~/.claude/ setup for the most common waste patterns and hands back exact, copy-paste fixes.
