André Lindenberg’s Post

MindStudio measured the MCP tax many developers carry blind. Every connected server injects its full tool schema into every message … not at session start, every turn. Four servers burn 15,000–20,000 tokens per turn … 9% of a 200k context window before the first real prompt. Estimation: tokens ≈ (tools × 200) + (description chars ÷ 4). Project-scoped configs and pruning mitigate it … something like lazy tool loading at the protocol level could fix it. #MCP #contextEngineering #tokenOptimization #claudeCode
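The estimation rule in the post can be turned into a quick back-of-the-envelope check. This is a minimal sketch: the function names are made up for illustration, the 200-tokens-per-tool constant and the chars ÷ 4 ratio come straight from the post's formula, and the example numbers (60 tools, 24,000 description characters) are hypothetical, not measurements.

```python
def estimate_schema_tokens(tool_count: int, description_chars: int,
                           tokens_per_tool: int = 200) -> int:
    """Rough estimate from the post: tokens ≈ (tools × 200) + (description chars ÷ 4)."""
    return tool_count * tokens_per_tool + description_chars // 4


def context_share(tokens: int, context_window: int = 200_000) -> float:
    """Fraction of the context window taken up by the given token count."""
    return tokens / context_window


if __name__ == "__main__":
    # Hypothetical setup: 4 servers exposing 15 tools each, 24,000 chars of descriptions.
    tokens = estimate_schema_tokens(tool_count=60, description_chars=24_000)
    print(f"{tokens} tokens, {context_share(tokens):.1%} of a 200k window")
```

With those hypothetical inputs the estimate lands inside the 15,000–20,000 range quoted above. A real tokenizer count will differ, so treat it as a first approximation only.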

Feels like we need “tool attention” mechanisms, only surface schemas when the model is likely to use them.

I think saying that it "burns 15,000–20,000 tokens per turn" sends the wrong message. It adds 20k tokens to the start of the conversation, and the way LLMs work means you're sending the whole conversation back and forth anyway.
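The two framings can be compared with some simple arithmetic. The sketch below assumes no prompt caching, a fixed schema prefix that is resent with every request, and a constant number of new tokens added to the history each turn. The function name and all numbers are hypothetical.

```python
def cumulative_input_tokens(schema_tokens: int, turns: int, tokens_per_turn: int) -> int:
    """Total input tokens sent over a conversation when every request
    resends the schema prefix plus the full history so far."""
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_turn        # this turn's new messages join the history
        total += schema_tokens + history  # one request = schema prefix + history
    return total


if __name__ == "__main__":
    with_tools = cumulative_input_tokens(schema_tokens=18_000, turns=10, tokens_per_turn=500)
    without_tools = cumulative_input_tokens(schema_tokens=0, turns=10, tokens_per_turn=500)
    print(with_tools, without_tools, with_tools - without_tools)
```

Under these assumptions the schema takes up the same 18,000 tokens of the window on every turn (the "upfront" view), while the input tokens sent grow by 18,000 per request (the "per turn" view). Caching, where available, changes the cost side but not the window occupancy.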

We took that into account when we built this - check this out - www.actionpilot.co.uk ActionPilot sits between your models and any MCP-connected system — policy, simulation, isolation and audit, before a single row is touched. Bring your own LLM. Bring any MCP server, including Zoho MCP.

Fair point: context bloat can act like a one-time upfront fee rather than a per-turn surcharge, so reframing the problem helps teams focus their token-trimming efforts where it counts most. Maybe test your next config change with a quick before-and-after context window check to see what's really moving the needle.

There is a project that addresses this issue: https://portofcontext.com/

Always loading MCP servers is a lazy approach that breaks the model's awareness. On the other hand, if the schema is not there, the model does not know the tool exists and can't choose to call it. What I found works best is loading a lightweight index, which could be tool names plus a one-line description of when each should be called (50 tokens per tool, for example), and then only pulling full schemas when a tool is invoked.
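The lightweight-index idea can be sketched in a few lines. This is only an illustration under stated assumptions: `LazyToolRegistry`, its methods, and the `search_issues` tool are hypothetical names, and none of this is part of the MCP protocol or any MCP SDK. The loader callables stand in for whatever would fetch a full schema from a server.

```python
from typing import Callable, Dict


class LazyToolRegistry:
    """Keeps a small name + hint index in context; loads full schemas on demand."""

    def __init__(self) -> None:
        self._hints: Dict[str, str] = {}                    # tool name -> one-line "when to call" hint
        self._loaders: Dict[str, Callable[[], dict]] = {}   # tool name -> function returning the full schema
        self._cache: Dict[str, dict] = {}                   # schemas already fetched

    def register(self, name: str, hint: str, loader: Callable[[], dict]) -> None:
        self._hints[name] = hint
        self._loaders[name] = loader

    def index_prompt(self) -> str:
        """The only text that would be injected into every request."""
        return "\n".join(f"{name}: {hint}" for name, hint in self._hints.items())

    def schema_for(self, name: str) -> dict:
        """Fetch the full schema the first time a tool is picked, then reuse it."""
        if name not in self._cache:
            self._cache[name] = self._loaders[name]()
        return self._cache[name]


if __name__ == "__main__":
    registry = LazyToolRegistry()
    registry.register(
        "search_issues",
        "Use when the user asks to find tickets by keyword.",
        lambda: {"name": "search_issues", "parameters": {"query": {"type": "string"}}},
    )
    print(registry.index_prompt())               # small index the model always sees
    print(registry.schema_for("search_issues"))  # full schema, loaded only when needed
```

The model still knows every tool exists because the index is always present. The cost is an extra step between choosing a tool and calling it, since the full schema has to be fetched first.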

I have created a Standard to address the token tax problem and provide enhanced security for MCP Planning. The Standard is called the Agent Capability Standard, released Apache 2.0 for anyone to use. https://github.com/kahalewai/acs ACS solves the token tax problem by defining a two-layer capability exposure architecture: a token-minimal Capability Summary Index that provides planning-relevant awareness, and on-demand Capability Manifests that deliver full planning contracts without invocation-layer metadata. At higher conformance levels, ACS defines a Planning Authorization Enforcement Point (PAEP) that governs capability visibility prior to planner reasoning, eliminating the planning-layer reconnaissance surface created by pre-authorization schema injection. Please feel free to use this Standard in your MCP implementations, or reach out if you have any questions. Hope this helps someone!

You can try calling the API directly from the Claude Code input and then compare it with the MCP connection to form your own opinion.

MCP skills are being worked on to reduce context bloat.
