Centralize truncation in conversation history #5652
Conversation
38241e5 to df2fabc
💡 Codex Review
Here are some automated review suggestions for this pull request.
@codex review this
💡 Codex Review
codex/codex-rs/core/src/client_common.rs, lines 71 to 84 in 30faba1:
```rust
pub(crate) fn get_formatted_input(&self) -> Vec<ResponseItem> {
    let mut input = self.input.clone();
    // when using the *Freeform* apply_patch tool specifically, tool outputs
    // should be structured text, not json. Do NOT reserialize when using
    // the Function tool - note that this differs from the check above for
    // instructions. We declare the result as a named variable for clarity.
    let is_freeform_apply_patch_tool_present = self.tools.iter().any(|tool| match tool {
        ToolSpec::Freeform(f) => f.name == "apply_patch",
        _ => false,
    });
    if is_freeform_apply_patch_tool_present {
        reserialize_shell_outputs(&mut input);
    }
```
The new truncation flow expects function_call_output content sent back to the model to be plain text with Exit code/Wall time headers, and the added integration tests assert that these outputs are not valid JSON. However, Prompt::get_formatted_input only converts the JSON payload produced by format_exec_output_for_model into a structured plain‑text string when a freeform apply_patch tool is present (client_common.rs lines 71‑84). For normal sessions that use only the standard shell tool, the reserialization step never runs, so the next prompt still contains the original JSON blob. This leaves the function_call_output valid JSON (and without the expected line prefixes), causing the new tests to fail and the model to see a different format than intended. The reserialization should happen whenever shell outputs are present, not only when the freeform apply_patch tool is configured.
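A minimal sketch of the direction this comment points to, assuming the `reserialize_shell_outputs` helper and surrounding types from the quoted snippet: run the reserialization whenever shell outputs are present, regardless of which apply_patch tool variant is configured. This is illustrative, not the PR's actual fix.

```rust
pub(crate) fn get_formatted_input(&self) -> Vec<ResponseItem> {
    let mut input = self.input.clone();
    // Always restore the plain-text "Exit code" / "Wall time" format for
    // function_call_output items, whether or not the Freeform apply_patch
    // tool is present.
    reserialize_shell_outputs(&mut input);
    input
}
```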
```rust
let truncated_by_bytes = content.len() > MODEL_FORMAT_MAX_BYTES;
let marker = if omitted > 0 {
    Some(format!(
        "\n[... omitted {omitted} of {total_lines} lines ...]\n\n"
```
This would be a bit wrong. You are counting metadata lines and not just shell output lines.
True, it would be off by +3 or something.
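A hedged sketch of the adjustment being discussed: count omitted lines over the shell output only, excluding the metadata header. The `METADATA_LINES = 3` constant mirrors the "+3" estimate above and is an assumption, not a value from the PR.

```rust
// Assumed number of metadata header lines (e.g. exit code, wall time, blank line).
const METADATA_LINES: usize = 3;

fn omitted_marker(total_lines: usize, kept_lines: usize) -> Option<String> {
    // Count only shell output lines, not the metadata header.
    let output_lines = total_lines.saturating_sub(METADATA_LINES);
    let omitted = output_lines.saturating_sub(kept_lines);
    (omitted > 0).then(|| format!("\n[... omitted {omitted} of {output_lines} lines ...]\n\n"))
}
```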
Once all comments are processed, this is cool.
But are we 200% sure the formatting is exactly equivalent for the tool call results? This is constrained by the training of the model, so we must be certain we use the same format in training and prod.
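For what it's worth, here is a hypothetical assertion in the spirit of the integration tests described above, checking that the output sent to the model is the plain-text format rather than a JSON blob. The function name and exact headers are assumptions, not the PR's actual test code.

```rust
fn assert_plain_text_shell_output(output: &str) {
    // The payload should no longer parse as JSON once reserialized.
    assert!(serde_json::from_str::<serde_json::Value>(output).is_err());
    // It should carry the plain-text metadata headers the model was trained on.
    assert!(output.starts_with("Exit code"));
    assert!(output.contains("Wall time"));
}
```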
Move the truncation logic into conversation history so it can be applied to any tool output. This will help us avoid edge cases when truncating tool calls and MCP calls.
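As a rough illustration of the idea (not code from this PR), a centralized helper could live at the conversation-history layer and be applied to every tool output, shell and MCP alike; the constant and the truncation policy below are assumptions.

```rust
// Assumed byte limit, for illustration only.
const MODEL_FORMAT_MAX_BYTES: usize = 10 * 1024;

/// Truncate any tool output before it is recorded in conversation history,
/// so shell calls and MCP calls share one code path.
fn truncate_for_model(content: &str) -> String {
    if content.len() <= MODEL_FORMAT_MAX_BYTES {
        return content.to_string();
    }
    // Cut at a char boundary at or below the byte limit so the slice stays valid UTF-8.
    let mut cut = MODEL_FORMAT_MAX_BYTES;
    while !content.is_char_boundary(cut) {
        cut -= 1;
    }
    format!("{}\n[... output truncated ...]\n", &content[..cut])
}
```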