Open
Description
When performing a streaming Generate() call that includes tools, the final response contains only the output of the last generation (the one that follows the tool response) and omits the reasoning and earlier messages generated by the model.
Code to reproduce the issue
var streamedOutput string
final, err := genkit.Generate(ctx, g,
	ai.WithPrompt("what is a gablorken of value 2 over 3?"),
	ai.WithTools(gablorkenTool),
	ai.WithStreaming(func(ctx context.Context, chunk *ai.ModelResponseChunk) error {
		for _, content := range chunk.Content {
			streamedOutput += content.Text
		}
		return nil
	}))
if err != nil {
	t.Fatal(err)
}
// Verify final output matches streamed content
finalOutput := final.Text()
if streamedOutput != finalOutput {
	t.Errorf("Streaming output doesn't match final output\nStreamed: %s\nFinal: %s",
		streamedOutput, finalOutput)
}
// Output
Streamed: I can help you calculate the gablorken! Based on your question, you want to calculate a gablorken with value 2 over 3. Let me use the gablorken calculation tool for you. The gablorken of value 2 over 3 is 8.
Final: The gablorken of value 2 over 3 is 8.

In the streamed response, the received chunks include both the messages produced prior to the tool call execution and the answer produced by the final generation after the tool response.
In this case, only the model messages are missing, but the same applies to reasoning messages and intermediate tool response messages. In other words, the final response contains only the output of the last generate call.
Root cause
The roots of this issue can be found here:
- Initially, the resp variable contains the original response message from the model prior to the tool execution (here).
- Then, the flow continues until reaching the point where Genkit needs to see if there are tool requests to be handled (here).
- If there are no tools needed, the generate() call returns with the original response (here).
- But if there were tools that had to be handled, a new generate() request gets triggered with a new request message (here).
- The return value contains only the response messages from that last generate call, omitting the original resp messages (thoughts or model messages).
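The steps above can be illustrated with a small, self-contained sketch. It does not use Genkit's actual types or internals: Message, Response, fakeModel, generateDropping and generateKeeping are all hypothetical names invented for illustration, and the tool output is hard-coded. The first function mimics the reported behavior (only the last call's response is returned), while the second shows one possible way the earlier messages could be carried forward.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical stand-in for a model, user, or tool message.
type Message struct {
	Role string
	Text string
}

// Response is a hypothetical stand-in for a model response.
type Response struct {
	Messages    []Message
	ToolRequest bool // true when the model asked for a tool call
}

// Text joins the text of every message in the response.
func (r Response) Text() string {
	parts := make([]string, 0, len(r.Messages))
	for _, m := range r.Messages {
		parts = append(parts, m.Text)
	}
	return strings.Join(parts, " ")
}

// fakeModel (an assumption, not a real model) emits a preamble plus a tool
// request on the first turn, and the final answer once a tool message exists.
func fakeModel(history []Message) Response {
	for _, m := range history {
		if m.Role == "tool" {
			return Response{Messages: []Message{{Role: "model", Text: "The gablorken of value 2 over 3 is 8."}}}
		}
	}
	return Response{
		Messages:    []Message{{Role: "model", Text: "Let me use the gablorken calculation tool for you."}},
		ToolRequest: true,
	}
}

// withToolResult copies the history and appends the model messages and a
// hard-coded tool result, forming the request for the next generate call.
func withToolResult(history []Message, resp Response) []Message {
	next := append(append([]Message{}, history...), resp.Messages...)
	return append(next, Message{Role: "tool", Text: "8"})
}

// generateDropping mimics the reported behavior: after handling the tool
// request it returns only the response of the last call.
func generateDropping(history []Message) Response {
	resp := fakeModel(history)
	if !resp.ToolRequest {
		return resp
	}
	return generateDropping(withToolResult(history, resp))
}

// generateKeeping prepends the pre-tool messages to the final response.
func generateKeeping(history []Message) Response {
	resp := fakeModel(history)
	if !resp.ToolRequest {
		return resp
	}
	final := generateKeeping(withToolResult(history, resp))
	final.Messages = append(append([]Message{}, resp.Messages...), final.Messages...)
	return final
}

func main() {
	prompt := []Message{{Role: "user", Text: "what is a gablorken of value 2 over 3?"}}
	fmt.Println("dropping:", generateDropping(prompt).Text())
	fmt.Println("keeping: ", generateKeeping(prompt).Text())
}
```

Whether a fix should merge the earlier messages into the final response, as generateKeeping does, or expose them some other way is a design decision for the maintainers; the sketch only makes the difference between the two behaviors concrete.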