Feature Description
We need a deterministic “communication layer” between LlamaIndex agents and tool execution.
Core need
- Deterministic input preprocessing before tool execution
  - Example: an MCP billing tool expects an `id` field, but `id` must be filled/overridden deterministically (e.g., derived from authenticated session/customer context) — NOT chosen or hallucinated by the LLM.
- Deterministic output postprocessing after the tool returns and before the agent sees it
  - Example: the tool returns ~300 fields; we need deterministic filtering down to 3 fields to avoid context bloat and keep agent behavior stable.
Observed limitation
With current agent+tools patterns, once tools are registered the agent generates tool-call JSON/args and invokes tools without an obvious native “intercept/transform” layer for:
- tool-call input args (pre)
- tool output (post)
in a way that is reusable across tools (cross-cutting concern), and especially clean for MCP tools.
Proposed feature (medium-detail, not a full spec)
Support one (or a combination) of:
- Agent/tool execution hook or middleware to transform tool inputs/outputs deterministically (pre/post).
- First-class tool wrapper/adapter API pattern documented and supported for IO transforms (so it’s ergonomic + consistent).
- Maintainer guidance on preferred placement: agent runner vs tool runner vs callback/event layer.
Minimal code example (current workaround vs desired)
Current workaround (per-tool wrapper; works but doesn’t scale well across many tools and doesn’t feel “first-class”):
```python
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.core.tools import FunctionTool

# Deterministic value from outside the LLM (e.g., auth/session layer)
SESSION_CUSTOMER_ID = "cust_123456"

def call_mcp_billing(payload: dict) -> dict:
    """Placeholder: call MCP tool server and return raw JSON (~300 fields)."""
    # ... network call ...
    return {"balance": 12.34, "due_date": "2025-12-31", "status": "OPEN", "lots_more": "..."}  # etc.

def billing_lookup(question: str, id: str | None = None) -> dict:
    # PRE: deterministically override/ignore LLM-provided `id`
    payload = {"id": SESSION_CUSTOMER_ID, "question": question}
    raw = call_mcp_billing(payload)
    # POST: deterministically shrink tool output before it hits agent context
    return {k: raw.get(k) for k in ("balance", "due_date", "status")}

billing_tool = FunctionTool.from_defaults(
    fn=billing_lookup,
    name="billing_lookup",
    description="Get billing info for the authenticated customer.",
)

agent = FunctionAgent(tools=[billing_tool])
# await agent.run("Why is my bill higher this month?")
```

Desired ergonomics (illustrative API; not current):
```python
agent = FunctionAgent(
    tools=mcp_tools,
    tool_io_middleware=[  # illustrative parameter; does not exist today
        ToolIOMiddleware(
            before=lambda tool_name, args: override_id(args, SESSION_CUSTOMER_ID),
            after=lambda tool_name, out: filter_fields(out, tool_name),
        )
    ],
)
```

Notes
- LlamaIndex already supports `FunctionTool` callbacks that can override tool output content, but this does not address a general, reusable pre/post IO transform layer across arbitrary tools/tool specs (especially MCP). Docs/examples:
  - FunctionTool reference: https://developers.llamaindex.ai/python/framework-api-reference/tools/function/
  - FunctionTool callback example: https://developers.llamaindex.ai/python/examples/tools/function_tool_callback/
Reason
We need deterministic tool I/O transforms for reliability and predictable integrations in production agent systems (MCP is a motivating case, but this is broadly useful for any tool calling).
Why this matters
- Reliability & correctness: LLM-chosen identifiers/fields can be wrong or inconsistent. Deterministically overriding fields like `id` prevents “random” tool calls and removes a class of failures.
- Compliance/auditability: For regulated environments, having a deterministic tool-call envelope (and a deterministic “view” of tool outputs that the agent consumes) improves auditability and reduces risk.
- Cost/latency + quality: Tool outputs with hundreds of fields create context bloat, increasing token usage and failure modes. Deterministic filtering (e.g., 300 → 3 fields) is a cross-cutting concern.
- Separation of concerns: Forcing every tool to implement its own filtering/override logic duplicates effort and makes behavior inconsistent.
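The deterministic output filtering described above (e.g., 300 → 3 fields) can be expressed as a small per-tool allowlist. A minimal sketch, assuming a hypothetical `FIELD_ALLOWLIST` registry; the `filter_fields` name matches the illustrative middleware example earlier, not any existing LlamaIndex API:

```python
# Hypothetical per-tool allowlist registry for deterministic output shaping.
FIELD_ALLOWLIST = {
    "billing_lookup": ("balance", "due_date", "status"),
    # other tools: their own shortlists
}

def filter_fields(out: dict, tool_name: str) -> dict:
    """Keep only the allowlisted fields for this tool; pass unknown tools through."""
    keep = FIELD_ALLOWLIST.get(tool_name)
    if keep is None:
        return out
    return {k: out.get(k) for k in keep}
```

Because the registry is plain data, the same function covers every tool, and adding a new tool's "view" is a one-line change rather than new filtering code inside the tool.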
Workarounds being explored (with tradeoffs)
- Put pre/post logic inside the MCP tool itself (e.g., add `fields_to_return`) — shifts responsibility into every tool and doesn’t solve cross-cutting concerns.
- Use the agent only as a tool selector; execute tool calls deterministically outside the agent — reduces the value of integrated agent/tool patterns.
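A third interim option is to factor the per-tool workaround into one cross-cutting wrapper applied to each tool's callable before registration. A minimal sketch; `wrap_tool_io`, its `before`/`after` hooks, and the fake billing call are hypothetical, not LlamaIndex APIs:

```python
from typing import Any, Callable

# Hypothetical cross-cutting wrapper: applies deterministic pre/post
# transforms uniformly around any tool callable.
def wrap_tool_io(
    call: Callable[..., dict],
    before: Callable[[dict], dict],
    after: Callable[[dict], dict],
) -> Callable[..., dict]:
    def wrapped(**kwargs: Any) -> dict:
        args = before(dict(kwargs))  # PRE: deterministic arg override
        raw = call(**args)           # actual tool / MCP call
        return after(raw)            # POST: deterministic output shaping
    return wrapped

# Example transforms for the billing case
SESSION_CUSTOMER_ID = "cust_123456"

def force_session_id(args: dict) -> dict:
    args["id"] = SESSION_CUSTOMER_ID  # ignore any LLM-provided id
    return args

def keep_billing_fields(out: dict) -> dict:
    return {k: out.get(k) for k in ("balance", "due_date", "status")}

def fake_billing_call(**payload: Any) -> dict:
    # Stand-in for the real MCP call; returns many fields.
    return {"balance": 12.34, "due_date": "2025-12-31", "status": "OPEN",
            "internal_code": "X1", "id": payload.get("id")}

safe_call = wrap_tool_io(fake_billing_call, force_session_id, keep_billing_fields)
```

Each wrapped callable would then be registered via `FunctionTool.from_defaults` as in the workaround above. The transforms at least live in one place, but this is still a per-tool wiring chore rather than first-class middleware.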
Related issues searched (not exact duplicates, but adjacent)
- Tool output / observability / callback-related discussions: [Bug]: Output from tool calls are not very observable #9770
- MCP + agents usage questions: [Question]: Ollama LLM with Agents not calling MCP Server tool #19023
- Capturing raw tool outputs from workflows (adjacent to “postprocessing” needs): [Question]: how to store tool's raw result from agent's tool calling in a workflow #19722
Keywords searched: “tool pre-processing”, “tool post-processing”, “tool middleware”, “agent tool hooks”, “callbacks tool input”, “tool output filtering”, “MCP”, “deterministic tool call”.
Value of Feature
What this unlocks
- A consistent, maintainable way to enforce deterministic tool-call envelopes (e.g., override/fill `id` from trusted context) without relying on the LLM.
- Deterministic, tool-specific (or global) output shaping to reduce context bloat and improve stability (e.g., 300 fields → 3 fields).
- Cleaner MCP tool integrations where the agent remains responsible for “what to do next”, but the communication layer stays deterministic and policy-compliant.
Expected impact
- Reduced token usage + faster responses by controlling tool output size before it reaches the agent.
- Fewer production incidents caused by malformed/undesired tool args.
- Clearer architecture guidance: where do input/output transforms belong (agent runner vs tool runner vs callback/event layer)?
What I’m asking maintainers for
- Either a first-class hook/middleware mechanism for tool IO transforms, OR
- A documented “blessed” wrapper/adapter pattern that is ergonomic, composable, and works across ToolSpecs (including MCP), with guidance on best practice placement in the stack.