
read_file for images: structuredContent.content (base64) is dumped as text by Claude Code CLI — large context bloat #521

Description

@carrotRakko

TL;DR: When Desktop Commander (DC) read_file is called on an image, Claude Code CLI serializes the resulting structuredContent.content (the image's base64) as text and injects it into the model's context. The image is also rendered natively, so the text-side base64 is a duplicate, and it is large enough to trigger [OUTPUT TRUNCATED - exceeded 25000 token limit]. The primary cause is on Claude Code CLI's side (documented with protocol-level proof in anthropics/claude-code#70280). Because DC is one of the most commonly used MCP servers with Claude Code, this issue is opened here to discuss what, if anything, DC could do server-side to mitigate the problem until the client-side fix lands.

Reproduction

Environment:

  • Claude Code 2.1.186 (CLI, in a Docker-isolated dev container)
  • DC v0.2.42 (@wonderwhy-er/desktop-commander)
  • Model: Opus 4.7

Calling read_file on a 146 KB PNG returns:

Claude Code CLI serializes the entire structuredContent to text and feeds it to the model. With CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS=25000, the resulting text hits [OUTPUT TRUNCATED - exceeded 25000 token limit].
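
A back-of-the-envelope check of why the limit is hit: base64 emits 4 characters for every 3 input bytes, so the encoded image alone is far larger than a 25,000-token budget under any plausible tokenization. The sketch below assumes 1 KB = 1024 bytes and a rough 4 characters per token; both are assumptions for illustration, not measured values, and real tokenizers vary.

```typescript
// Base64 encodes every 3 input bytes as 4 output characters (padding included).
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// ASSUMPTION: charsPerToken is a rough guess supplied by the caller.
// Real tokenizers vary, and base64 often tokenizes less efficiently than prose.
function estimateTokens(charCount: number, charsPerToken: number): number {
  return Math.ceil(charCount / charsPerToken);
}

const pngBytes = 146 * 1024;              // the 146 KB PNG from the reproduction
const b64Chars = base64Length(pngBytes);  // 199340 characters
const tokenLimit = 25000;                 // CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS
const exceedsLimit = estimateTokens(b64Chars, 4) > tokenLimit;

console.log({ b64Chars, exceedsLimit });
```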

Measured context consumption (controlled A/B with the same 146 KB PNG, two subagents in the same parent session):

| Tool path | subagent_tokens | tool_result text size |
|---|---|---|
| Read(image) + Read(.b64 of same image) | 39,199 | ~22.5 KB (built-in Read auto-truncates the .b64) |
| DC read_file(image), 1 call | 106,356 | ~93.6 KB (structuredContent JSON+base64 dumped as text) |

DC's read_file consumes ~2.7× the built-in baseline for the same image, with the image rendered natively in both cases.
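
The ratio can be checked directly from the two measurements above. This is only the arithmetic; the variable names are illustrative.

```typescript
const builtinTokens = 39199; // Read(image) + Read(.b64 of same image)
const dcTokens = 106356;     // DC read_file(image), 1 call

const ratio = dcTokens / builtinTokens;      // about 2.71
const extraTokens = dcTokens - builtinTokens; // additional tokens per image

console.log(ratio.toFixed(2), extraTokens);
```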

Where the primary fix belongs

Claude Code CLI deduplicating image content between content[] and structuredContent is the correct fix. anthropics/claude-code#70280 documents this with protocol-level proof and an A/B across different MCP servers (a FastMCP Python server with auto-structured_output, and DC — TypeScript with manually-constructed structuredContent). Either entry path hits the same client-side dump, so server author guidance alone can't fully prevent it.
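
To make the idea of client-side deduplication concrete, here is a minimal sketch of one way it could work: collect the base64 payloads of image blocks in content[], then replace any identical string inside structuredContent with a short placeholder before serializing it for the model. This is not Claude Code's actual implementation; the types, function names, and placeholder text are all assumptions for illustration.

```typescript
// Simplified, hypothetical shapes of an MCP tool result.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string };

interface ToolResult {
  content: ContentBlock[];
  structuredContent?: Record<string, unknown>;
}

// Walk a JSON-like value and replace strings that duplicate an image payload.
function stripDuplicates(value: unknown, seen: Set<string>): unknown {
  if (typeof value === "string") {
    return seen.has(value)
      ? `[image data omitted: ${value.length} base64 chars]`
      : value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => stripDuplicates(item, seen));
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, item] of Object.entries(value as Record<string, unknown>)) {
      out[key] = stripDuplicates(item, seen);
    }
    return out;
  }
  return value;
}

// Serialize structuredContent as text without repeating image bytes
// that are already present as native image blocks in content[].
function serializeStructuredForModel(result: ToolResult): string {
  const seen = new Set<string>();
  for (const block of result.content) {
    if (block.type === "image") seen.add(block.data);
  }
  return JSON.stringify(stripDuplicates(result.structuredContent ?? {}, seen));
}
```

Matching on exact string equality keeps the sketch simple; it would miss payloads that are re-encoded or wrapped (for example in a data: URI), which a real fix would need to consider.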

Why also opening this on the DC side

Until #70280 lands, users on Claude Code CLI + DC for image-heavy workflows (screenshots, OCR, UI inspection) take a real context hit per call. DC being one of the most-used MCP servers with Claude Code makes the impact non-trivial.

This Issue is to start a discussion on whether DC could offer a server-side mitigation. Some directions worth considering — explicitly without proposing any one of them as the right answer:

  • Make the image-branch structuredContent.content / imageData opt-in via a config flag or env var (default keeps current Cowork behavior)
  • Switch the image bytes in structuredContent to a resource_link or URI reference instead of inline base64
  • Use audience annotations on the content blocks to signal "for preview widget, not the LLM" (spec-compliant signaling either way)
  • Detect the client (DC already tracks currentClient) and adapt the payload — e.g. trim structuredContent.content when the client is Claude Code CLI
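
As a discussion aid only, the sketch below combines three of the directions above (an opt-in flag, a link instead of inline bytes, and client detection) into one hypothetical payload builder. Nothing here is DC's actual code: the env var name DC_STRUCTURED_IMAGE_MODE, the "claude-code" client-name match, the structuredContent field names, and the file:// URI construction (which assumes an absolute POSIX path) are all assumptions.

```typescript
type ImageMode = "inline" | "omit" | "link";

interface ImageFile {
  path: string;     // absolute POSIX path of the image (assumption)
  mimeType: string;
  base64: string;
}

// Decide how image bytes appear in structuredContent.
// An explicit setting wins; otherwise fall back to client detection.
function resolveImageMode(
  env: Record<string, string | undefined>,
  clientName?: string,
): ImageMode {
  const explicit = env["DC_STRUCTURED_IMAGE_MODE"]; // hypothetical env var
  if (explicit === "inline" || explicit === "omit" || explicit === "link") {
    return explicit;
  }
  // ASSUMPTION: the client identifies itself with a name containing "claude-code".
  if (clientName !== undefined && clientName.toLowerCase().includes("claude-code")) {
    return "omit";
  }
  return "inline"; // default keeps the current behavior
}

function buildImageResult(img: ImageFile, mode: ImageMode) {
  // The native image block is always sent so the image still renders.
  const content: Array<Record<string, unknown>> = [
    { type: "image", data: img.base64, mimeType: img.mimeType },
  ];
  const structuredContent: Record<string, unknown> = { mimeType: img.mimeType };

  if (mode === "inline") {
    structuredContent["content"] = img.base64;
  } else if (mode === "link") {
    const uri = `file://${img.path}`;
    structuredContent["uri"] = uri;
    content.push({ type: "resource_link", uri, name: img.path, mimeType: img.mimeType });
  }
  // mode === "omit": structuredContent carries metadata only.
  return { content, structuredContent };
}
```

Passing the environment and client name in as parameters keeps the decision logic easy to test; a real implementation would read them from wherever DC already tracks configuration and currentClient.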

We plan to send a PR taking one of these directions; opening this Issue first to anchor the design discussion before code lands.

Related:

✍️ Author: Claude Code with @carrotRakko (AI-written, human-approved)
