[Feature Request] Knowledge Retrieval Events in Stream Output #5873

@wzqwtt

Problem Description

When stream_events=True is enabled in the Agent's run method, the framework yields intermediate step events for various operations during the agent execution loop.

However, intermediate steps that occur during the message construction phase (inside _get_run_messages()) are not yielded to the event stream, even though some of these operations are conceptually equivalent to tool calls.

Current Behavior:

The _get_run_messages() method (line 8783 in /libs/agno/agno/agent/agent.py) is responsible for constructing the messages to send to the model. During this process, it performs several operations that could be considered "intermediate steps":

  1. Knowledge base retrieval: When add_knowledge_to_context=True, the _get_user_message() method calls get_relevant_docs_from_knowledge() at line 8548
  2. Memory retrieval: When add_user_memories_to_context=True, it retrieves user memories from the memory manager (lines 8289-8303)
  3. Cultural knowledge retrieval: When add_culture_to_context=True, it retrieves cultural knowledge (line 8335)
  4. Other context-building operations

Since _get_run_messages() is a regular function (not a generator), these operations execute silently without yielding any events to the stream.
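To make the gap concrete, here is a minimal, framework-independent sketch of the current behavior. It does not use agno's real classes: build_messages(), run_stream(), and the event names are hypothetical stand-ins chosen only for illustration.

```python
from typing import Iterator, List


def build_messages(query: str) -> List[str]:
    """Hypothetical stand-in for _get_run_messages(): a regular function."""
    # Stand-in for knowledge retrieval; in a real agent this may be slow.
    docs = [f"doc about {query}"]
    # A regular function can only return its result; it cannot emit events.
    return ["system prompt", query + "\n" + "\n".join(docs)]


def run_stream(query: str) -> Iterator[str]:
    """Hypothetical run loop that streams event names as plain strings."""
    yield "RunStarted"
    messages = build_messages(query)  # retrieval happens here, silently
    yield "ModelCallStarted"
    yield f"ModelCallCompleted(messages={len(messages)})"


events = list(run_stream("authentication best practices"))
for event in events:
    print(event)
```

In this sketch, nothing is emitted between "RunStarted" and "ModelCallStarted", which is exactly the window in which retrieval runs.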

Example Use Case - Knowledge Base Retrieval:

Consider a RAG (Retrieval-Augmented Generation) agent with a large knowledge base:

  • User sends a query: "What are the best practices for authentication?"
  • The agent calls get_relevant_docs_from_knowledge() which takes 3-5 seconds to search and retrieve relevant documents
  • These documents are added to the system prompt before the LLM call
  • Current behavior: No events are emitted during this retrieval, creating a "black box" period where the user has no visibility
  • Desired behavior: Emit events showing that knowledge retrieval is in progress, similar to how tool calls are streamed

Impact:

  • Lack of visibility into potentially time-consuming operations during message construction
  • Inconsistent event coverage - post-message operations emit events, but pre-message operations do not
  • Difficult to debug or monitor performance of knowledge retrieval, memory loading, and other context-building operations
  • Poor user experience when these operations take significant time (users see no feedback)

Proposed Solution

The core issue is that _get_run_messages() is a regular function that cannot yield events. Operations during message construction are conceptually similar to tool calls but are currently invisible to the event stream.

My proposed approach is to make _get_run_messages() streamable: convert _get_run_messages() and its related methods, such as _get_user_message() and get_system_message(), into generator functions that can yield events during execution.
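A rough sketch of what this could look like, assuming Python generators with `yield from` are used to forward events while still returning the constructed messages. All names here (the event classes, search_knowledge(), and the function names) are hypothetical and are not agno's actual API.

```python
from dataclasses import dataclass
from typing import Generator, List


@dataclass
class KnowledgeRetrievalStarted:  # hypothetical event type
    query: str


@dataclass
class KnowledgeRetrievalCompleted:  # hypothetical event type
    query: str
    num_documents: int


@dataclass
class ModelCallStarted:  # hypothetical event type
    num_messages: int


def search_knowledge(query: str) -> List[str]:
    """Stand-in for a potentially slow knowledge base search."""
    return [f"doc about {query}"]


def get_user_message(query: str) -> Generator[object, None, str]:
    """Generator version: yields events, then returns the message text."""
    yield KnowledgeRetrievalStarted(query=query)
    docs = search_knowledge(query)
    yield KnowledgeRetrievalCompleted(query=query, num_documents=len(docs))
    return query + "\n\nContext:\n" + "\n".join(docs)


def get_run_messages(query: str) -> Generator[object, None, List[str]]:
    """Forwards inner events and returns the full message list."""
    # `yield from` re-yields the inner events and captures the return value.
    user_message = yield from get_user_message(query)
    return ["You are a helpful assistant.", user_message]


def run_stream_with_retrieval_events(query: str) -> Generator[object, None, None]:
    """Hypothetical run loop that surfaces message-construction events."""
    messages = yield from get_run_messages(query)
    yield ModelCallStarted(num_messages=len(messages))


for event in run_stream_with_retrieval_events("authentication best practices"):
    print(event)
```

One consequence of this design is that every caller of the converted methods has to drive the generator (for example with `yield from`, or by exhausting it) to obtain the messages, so non-streaming code paths would need a small wrapper.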

Alternatives Considered

No response

Additional Context

No response

Would you like to work on this?

  • Yes, I’d love to work on it!
  • I’m open to collaborating but need guidance.
  • No, I’m just sharing the idea.

Labels: enhancement (New feature or request)
