
langchain.agents.middleware.SummarizationMiddleware: DEFAULT_SUMMARY_PROMPT includes irrelevant message metadata, which can overflow the context window #34517

@lsc13020528317

Description

Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

  • langchain

Example Code (Python)

from langchain.agents.middleware import SummarizationMiddleware
from langchain.agents import create_agent

# `summary_llm` and `model` are any chat models, e.g. created with init_chat_model(...)
summary_middleware = SummarizationMiddleware(
    model=summary_llm,
    max_tokens_before_summary=512,
    messages_to_keep=0,
)

agent = create_agent(
    name="test_summary_agent",
    model=model,
    middleware=[summary_middleware],
)

# Use an input whose token count exceeds max_tokens_before_summary,
# so that summary_middleware actually triggers.
model_input = "your test input "
result = agent.invoke(input={"messages": [{"role": "user", "content": model_input}]})

Error Message and Stack Trace (if applicable)

Error code: 400 - {'error': {'message': "This model's maximum context length is 262144 tokens. However, your request has 272099 input tokens. Please reduce the length of the input messages. None", 'type': 'BadRequestError', 'param': None, 'code': 400}}

-------------------
In fact, the input that SummarizationMiddleware sends to the summarization model looks like this (note that it contains unrelated model usage metadata):
<role>
Context Extraction Assistant
</role>

<primary_objective>
Your sole objective in this task is to extract the highest quality/most relevant context from the conversation history below.
</primary_objective>

<objective_information>
You're nearing the total number of input tokens you can accept, so you must extract the highest quality/most relevant pieces of information from your conversation history.
This context will then overwrite the conversation history presented below. Because of this, ensure the context you extract is only the most important information to your overall goal.
</objective_information>

<instructions>
The conversation history below will be replaced with the context you extract in this step. Because of this, you must do your very best to extract and record all of the most important context from the conversation history.
You want to ensure that you don't repeat any actions you've already completed, so the context you extract from the conversation history should be focused on the most important information to your overall goal.
</instructions>

The user will message you with the full message history you'll be extracting context from, to then replace. Carefully read over it all, and think deeply about what information is most important to your overall goal that should be saved:

With all of this in mind, please carefully read over the entire conversation history, and extract the most important and relevant context to replace it so that you can free up space in the conversation history.
Respond ONLY with the extracted context. Do not include any additional information, or text before or after the extracted context.

<messages>
Messages to summarize:
[HumanMessage(content="Search the web for technical documentation, failure cases, and troubleshooting guides related to 'deadlock', focusing on disassembly-level analysis methods and debugging techniques; extract typical failure symptoms, diagnosis steps, solutions, and similar information.", additional_kwargs={}, response_metadata={}, id='f73acfae-f4e4-4709-8f7c-b9f8d7647ff6'), AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 5512, 'total_tokens': 5539, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_provider': 'openai', 'model_name': 'Qwen3-235B-A22B', 'system_fingerprint': None, 'id': 'chatcmpl-88b48347-4b11-48a9-8042-fc7753b6c65f', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--019b5eb5-8bd7-7611-8076-f60398467090-0', tool_calls=[{'name': 'search_pages', 'args': {'keyword': 'deadlock', 'in_space': False}, 'id': 'chatcmpl-tool-99d91ac2a6b2785e', 'type': 'tool_call'}], usage_metadata={'input_tokens': 5512, 'output_tokens': 27, 'total_tokens': 5539, 'input_token_details': {}, 'output_token_details': {}}),

Description

When the summarization middleware triggers (i.e. the input messages' token count exceeds max_tokens_before_summary), it builds the summarization model's input from DEFAULT_SUMMARY_PROMPT in langchain.agents.middleware.SummarizationMiddleware, which looks like this:

Context Extraction Assistant

<primary_objective>
Your sole objective in this task is to extract the highest quality/most relevant context from the conversation history below.
</primary_objective>

<objective_information>
You're nearing the total number of input tokens you can accept, so you must extract the highest quality/most relevant pieces of information from your conversation history.
This context will then overwrite the conversation history presented below. Because of this, ensure the context you extract is only the most important information to your overall goal.
</objective_information>

The conversation history below will be replaced with the context you extract in this step. Because of this, you must do your very best to extract and record all of the most important context from the conversation history. You want to ensure that you don't repeat any actions you've already completed, so the context you extract from the conversation history should be focused on the most important information to your overall goal.

The user will message you with the full message history you'll be extracting context from, to then replace. Carefully read over it all, and think deeply about what information is most important to your overall goal that should be saved:

With all of this in mind, please carefully read over the entire conversation history, and extract the most important and relevant context to replace it so that you can free up space in the conversation history.
Respond ONLY with the extracted context. Do not include any additional information, or text before or after the extracted context.

{messages}

WARN:
When this template is rendered, "{messages}" is filled with the full repr() of every message, including response_metadata, usage_metadata, ids, and tool-call bookkeeping, like this:
HumanMessage(content="XXXXXXXX", additional_kwargs={}, response_metadata={}, id='f73acfae-f4e4-4709-8f7c-b9f8d7647ff6'), AIMessage(content='', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 5512, 'total_tokens': 5539, 'completion_tokens_details': None, 'prompt_tokens_details': None}, 'model_provider': 'openai', 'model_name': 'Qwen3-235B-A22B', 'system_fingerprint': None, 'id': 'chatcmpl-88b48347-4b11-48a9-8042-fc7753b6c65f', 'finish_reason': 'tool_calls', 'logprobs': None}, id='lc_run--019b5eb5-8bd7-7611-8076-f60398467090-0', tool_calls=[{'name': 'search_pages', 'args': {'keyword': 'XXXXX', 'in_space': False}, 'id': 'chatcmpl-tool-99d91ac2a6b2785e', 'type': 'tool_call'}], usage_metadata={'input_tokens': 5512, 'output_tokens': 27, 'total_tokens': 5539, 'input_token_details': {}, 'output_token_details': {}}),


ERROR:
When the message token count is already close to the model's maximum context length, this additional, irrelevant metadata (tool-call details, usage statistics, etc.) can push the total token count over the limit, producing a context-length error and causing the agent run to fail!
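To see how much overhead the repr()-based rendering adds, here is a plain-Python illustration (not LangChain code) using a dict shaped like the AIMessage above; the field names mirror the repr, but the object is just a dict:

```python
# Illustration only: a dict mimicking the AIMessage repr shown above.
# Its semantic content is empty; everything else is bookkeeping metadata.
message = {
    "content": "",
    "response_metadata": {
        "token_usage": {
            "completion_tokens": 27,
            "prompt_tokens": 5512,
            "total_tokens": 5539,
        },
        "model_provider": "openai",
        "model_name": "Qwen3-235B-A22B",
        "finish_reason": "tool_calls",
    },
    "tool_calls": [
        {
            "name": "search_pages",
            "args": {"keyword": "deadlock", "in_space": False},
            "id": "chatcmpl-tool-99d91ac2a6b2785e",
        }
    ],
}

# What the default prompt interpolates vs. what actually matters:
rendered = repr(message)      # full repr, metadata and all
content = message["content"]  # the conversational content (empty here)

print(len(rendered), len(content))
```

Multiplied over hundreds of messages, this metadata alone can account for a large share of the prompt, which is consistent with the 272099-token request in the error above.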


I expected the middleware to produce a correct summary of the agent's conversation history; instead, the summarization request itself exceeds the model's maximum context length and raises an error.
So I think this is a bug that needs to be fixed.
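As a possible mitigation until this is fixed upstream, the messages could be rendered with only their semantic content before being interpolated into the summary prompt. The sketch below is hypothetical (format_for_summary is my own helper, not a LangChain API), and assumes messages arrive as role/content dicts; with real BaseMessage objects the attribute access would differ:

```python
def format_for_summary(messages):
    """Render messages as plain role/content lines, dropping
    response_metadata, usage_metadata, ids, and other bookkeeping."""
    lines = []
    for msg in messages:
        role = msg.get("role", "unknown")
        content = msg.get("content", "")
        if content:
            lines.append(f"{role}: {content}")
        # Keep tool calls, but only the parts the summarizer needs.
        for call in msg.get("tool_calls", []):
            lines.append(f"{role} -> tool {call['name']}({call['args']})")
    return "\n".join(lines)


history = [
    {"role": "user", "content": "Search the web for deadlock debugging guides."},
    {"role": "assistant", "content": "",
     "tool_calls": [{"name": "search_pages",
                     "args": {"keyword": "deadlock", "in_space": False}}]},
]

print(format_for_summary(history))
```

If SummarizationMiddleware exposed a hook for how "{messages}" is rendered, or trimmed the metadata itself before formatting, the summarization request would stay close to the size of the actual conversation.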

System Info

System Information

OS: Linux
OS Version: #1 SMP Wed Jun 10 09:04:49 EDT 2020
Python Version: 3.11.3 (main, Apr 19 2023, 23:54:32) [GCC 11.2.0]

Package Information

langchain_core: 1.2.4
langchain: 1.2.0
langsmith: 0.4.43
langchain_anthropic: 1.3.0
langchain_mcp_adapters: 0.1.13
langchain_openai: 1.0.3
langgraph_api: 0.5.23
langgraph_cli: 0.4.7
langgraph_runtime_inmem: 0.18.1
langgraph_sdk: 0.2.9

Optional packages not installed

langserve

Other Dependencies

anthropic: 0.75.0
blockbuster: 1.5.25
click: 8.3.1
cloudpickle: 3.1.2
cryptography: 44.0.3
grpcio: 1.76.0
grpcio-tools: 1.75.1
httpx: 0.28.1
jsonpatch: 1.33
jsonschema-rs: 0.29.1
langgraph: 1.0.3
langgraph-checkpoint: 3.0.1
mcp: 1.21.2
openai: 2.8.1
opentelemetry-api: 1.38.0
opentelemetry-exporter-otlp-proto-http: 1.38.0
opentelemetry-sdk: 1.38.0
orjson: 3.11.4
packaging: 25.0
protobuf: 6.33.1
pydantic: 2.11.10
pyjwt: 2.10.1
pytest: 7.3.1
python-dotenv: 1.2.1
pyyaml: 6.0.3
requests: 2.32.5
requests-toolbelt: 1.0.0
rich: 14.2.0
sse-starlette: 2.1.3
starlette: 0.49.3
structlog: 25.5.0
tenacity: 8.2.2
tiktoken: 0.12.0
truststore: 0.10.4
typing-extensions: 4.15.0
uuid-utils: 0.12.0
uvicorn: 0.38.0
watchfiles: 1.1.1
zstandard: 0.25.0

Metadata

Assignees: no one assigned

Labels: bug (related to a bug, vulnerability, or unexpected error with an existing feature), langchain (`langchain` package issues & PRs)
