Skip to content

Conversation

@tstadel
Copy link
Member

@tstadel tstadel commented Dec 22, 2025

Related Issues

  • fixes not being able to select multiple output values (except all) for the LLM. E.g. this is how we want it to be configurable via UI
image

Currenlty it's not possible to make multiple selections for LLM (e.g. as in the sketch above) which increases UI complexity heavily.

Proposed Changes:

  • support multiple keys in Tool's outputs_to_string similar to outputs_to_state

How did you test it?

  • added unit tests

Notes for the reviewer

Checklist

  • I have read the contributors guidelines and the code of conduct
  • I have updated the related issue with new insights and changes
  • I added unit tests and updated the docstrings
  • I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
  • I documented my code
  • I ran pre-commit hooks and fixed any issue
@tstadel tstadel requested a review from a team as a code owner December 22, 2025 18:02
@tstadel tstadel requested review from vblagoje and removed request for a team December 22, 2025 18:02
@vercel
Copy link

vercel bot commented Dec 22, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Review Updated (UTC)
haystack-docs Ignored Ignored Preview Dec 23, 2025 3:13pm
@coveralls
Copy link
Collaborator

coveralls commented Dec 22, 2025

Pull Request Test Coverage Report for Build 20440323754

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 10 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.004%) to 92.264%

Files with Coverage Reduction New Missed Lines %
core/pipeline/async_pipeline.py 2 66.67%
tools/tool.py 2 98.02%
components/tools/tool_invoker.py 6 96.2%
Totals Coverage Status
Change from base Build 20377983188: 0.004%
Covered Lines: 14204
Relevant Lines: 15395

💛 - Coveralls
@vblagoje
Copy link
Member

Thanks @tstadel , taking a look today!

Copy link
Member

@vblagoje vblagoje left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tstadel If you could please add a bit of pydocs for this use case in tool.py that would be amazing icing on the cake for this enhancement. In reno note just another sentence or two with example (before - how it was not possible, and now how it is, where is the setting etc etc) will be super helpful to users and and as a history log for us as well.

The function that will be invoked when the Tool is called.
Must be a synchronous function; async functions are not supported.
:param outputs_to_string:
Optional dictionary defining how a tool outputs should be converted into a string.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add example here for this use case @tstadel ?

@github-actions github-actions bot added the type:documentation Improvements on the docs label Dec 23, 2025
@tstadel
Copy link
Member Author

tstadel commented Dec 23, 2025

@vblagoje if you want to have a final look over the added docs, I've updated them.

@vblagoje
Copy link
Member

vblagoje commented Dec 29, 2025

@tstadel it looks good to me. However, as this is a crucial component, unless it is urgent I'd ask @anakin87 and perhaps @sjrl to have one last look. We need to think about the impact this change will have on agent - state - tool interactions. Have you played with this while prepping this PR? All good?

@tstadel
Copy link
Member Author

tstadel commented Dec 29, 2025

@tstadel it looks good to me. However, as this is a crucial component, unless it is urgent I'd ask @anakin87 and perhaps @sjrl to have one last look. We need to think about the impact this change will have on agent - state - tool interactions. Have you plaid with this while prepping this PR? All good?

Sounds good. Additionally to the tests in this PR (which of course ensure backward compatibility as this is just an extension) we'll infer the required state schema on platform side. For open source you need to set the state as before.

Comment on lines 39 to +41
:param outputs_to_string:
Optional dictionary defining how a tool outputs should be converted into a string.
If the source is provided only the specified output key is sent to the handler.
If the source is omitted the whole tool result is sent to the handler.
Example:
```python
{
"source": "docs", "handler": format_documents
}
```
Optional dictionary defining how tool outputs should be converted into string(s).
Supports two formats:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the docstrings in all our tools, so also ComponentTool and PipelineTool

Comment on lines +23 to +25
function=search_func, # Returns {"documents": [Document(...)], "metadata": {"count": 5}, "debug_info": {...}}
outputs_to_string={
"formatted_docs": {"source": "documents", "handler": format_documents},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better if we provided code examples for these two functions, even if they are dummy functions. Two functions in question are format_documents and search_func

function=search_func, # Returns {"documents": [Document(...)], "metadata": {"count": 5}, "debug_info": {...}}
outputs_to_string={
"formatted_docs": {"source": "documents", "handler": format_documents},
"summary": {"source": "metadata", "handler": lambda m: f"Found {m['count']} results"}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sold on providing an example that uses a lambda function since this isn't encouraged by us since it will fail at serialization. So calling tool.to_dict() with this example will fail.

Could we also create a dummy handler function here as well?

Comment on lines +127 to +128
else:
for key, config in self.outputs_to_string.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a dev comment here?

Suggested change
else:
for key, config in self.outputs_to_string.items():
else:
# Multiple outputs configuration
for key, config in self.outputs_to_string.items():
tool_result_str = self._default_output_to_string_handler(tool_result)
return ChatMessage.from_tool(tool_result=tool_result_str, origin=tool_call)
except Exception as e:
error = StringConversionError(tool_call.tool_name, output_to_string_handler.__name__, e)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new structure of the code could lead to a state where the output_to_string_handler is not defined when this exception is triggered.

I'm wondering if we should restructure this method a little bit. The point of raising StringConversionError was to catch errors specifically coming from calling tool_result_str = output_to_string_handler(value) and I think that is a little muddied with your version.

I wonder if a two-tiered try-except blocks would be better. So one specifically around calls to output_to_string_handler(value) and then another around the whole code block if you think that's still necessary. What do you think?

Comment on lines +330 to +331
# If no source key is provided, we use the output key itself
source_key = config.get("source", output_key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of a sneaky feature that isn't mentioned anywhere in the docs. Is this really necessary to add or could we follow how the logic works for the single output format.

for output_key, config in outputs_config.items():
# If no source key is provided, we use the output key itself
source_key = config.get("source", output_key)
value = result[source_key]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could cause a KeyError since there is no guarantee that the source_key actually exists in the result dict. That's why we used result.get(source_key) above. Was this intentional?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

topic:tests type:documentation Improvements on the docs

5 participants