Handle large exec_command output and surface scan errors as TUI toasts#583

Open
mhspektr wants to merge 18 commits into usestrix:main from mhspektr:feature/issue-579

mhspektr commented Jun 22, 2026

NB: this builds on #577 and should be merged after it.

  • Large output handling: exec_command results are now capped at STRIX_MAX_TOOL_OUTPUT_CHARS (default 65536 chars). JSON arrays are trimmed to the first 50 records (valid JSON preserved); plain text keeps the first 300 lines. Both truncation paths include a header with the total size and record/line count, so the agent still knows how much output was produced. Set STRIX_MAX_TOOL_OUTPUT_CHARS=0 to disable.
  • MapReduce LLM compression: when a single chunk still overflows after truncation, a MapReduce pipeline compresses it via the LLM before it reaches the context window, preventing hard overflows.
  • Scan error toasts: scan thread failures now post a persistent Textual toast immediately rather than surfacing as a post-exit traceback. A timeout=0 bug that caused the toast to expire on mount was also fixed (Textual's default of ~5 s is now used).

Closes #579

mhspektr and others added 16 commits June 18, 2026 22:45
- Change RuntimeError to TypeError for type validation in report/writer.py
- Update pyupgrade to v3.21.2 for Python 3.14 compatibility
Mirror the layout introduced on feature/438-token_budget: pytest +
pytest-asyncio dev deps, asyncio_mode auto, a tests.* mypy override, and
pytest in the mypy pre-commit hook deps so the tests/ package type-checks.
…ix#492)

Large local targets were copied into the sandbox file-by-file via the SDK
LocalDir entry, which stalls on big repos and could leave /workspace empty.

- --mount <path> bind-mounts a host directory read-only at /workspace/<subdir>
  instead of copying it, bypassing the per-file stream.
- A size pre-flight (STRIX_MAX_LOCAL_COPY_MB, default 1024) fails fast with a
  clear message suggesting --mount when a non-mounted local target is too big.
An empty or whitespace-only --mount value resolves to the current working
directory and would silently bind-mount it into the sandbox. Reject it.
If the same directory is passed via --target and --mount (or as duplicate
values), it previously produced two targets — copied AND bind-mounted, and
the copied one could trip the size pre-flight. Dedupe by resolved path,
preferring the bind mount.
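
A minimal sketch of the dedupe rule described above, assuming each target is a plain dict with hypothetical `path` and `mount` keys (the PR's own data structures are not shown here and may differ):

```python
from pathlib import Path


def dedupe_targets(targets: list[dict]) -> list[dict]:
    """Keep one entry per resolved path, preferring bind-mounted entries.

    Each target is a hypothetical dict: {"path": str, "mount": bool}.
    """
    chosen: dict[Path, dict] = {}
    for target in targets:
        # Resolve so that "repo" and "./repo" collapse to the same key.
        key = Path(target["path"]).resolve()
        current = chosen.get(key)
        # First sighting wins, unless a later entry upgrades a copy to a mount.
        if current is None or (target["mount"] and not current["mount"]):
            chosen[key] = target
    return list(chosen.values())
```

Keying on the resolved path is what catches the same directory being passed under different spellings.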
Previously a value of 0 (or negative) made every local target count as
oversized, aborting all local scans. Now <= 0 disables the pre-flight.
os.walk silently swallowed directory-listing errors, so a permission-denied
subtree could make a large repo under-count and slip past the pre-flight.
Surface such omissions via an onerror warning.
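
The two pre-flight fixes above (a limit of `<= 0` disables the check, and listing errors are surfaced through `onerror`) could look roughly like this. The function names are illustrative and not taken from the PR:

```python
import os
import warnings


def tree_size_bytes(root: str) -> int:
    """Best-effort, stat-only size of a directory tree."""

    def _warn(err: OSError) -> None:
        # os.walk ignores listing errors unless an onerror callback is given.
        warnings.warn(f"could not list {err.filename}: {err.strerror}", stacklevel=2)

    total = 0
    for dirpath, _dirnames, filenames in os.walk(root, onerror=_warn):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return total


def is_oversized(root: str, limit_mb: int) -> bool:
    """Return True when root exceeds limit_mb; a limit <= 0 disables the check."""
    if limit_mb <= 0:
        return False
    return tree_size_bytes(root) > limit_mb * 1024 * 1024
```

A warning does not make the count accurate; it only tells the user that the reported size may be an under-count.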
Add CLI reference + example for --mount, document the size pre-flight env var,
note the read-only-is-not-a-hard-boundary caveat and that remote repos are not
size-checked, and clarify the backends docstring on when bind mounts apply.
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
ContextWindowExceededError carries status_code=400, which matched the
_INPUT_REJECTION_CODES guard and triggered image-strip retry logic.
Image stripping cannot reduce token count, so the agent wasted up to
3 retry cycles before parking as failed.

Add an explicit isinstance check before the status-code guard to detect
context-window overflow and park the agent as 'failed' immediately.
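
To illustrate why the ordering of the two checks matters, here is a self-contained sketch. The exception classes are local stand-ins rather than the real provider types, and the status-code set and return labels are hypothetical:

```python
class APIError(Exception):
    """Stand-in for a provider error that carries an HTTP status code."""

    def __init__(self, message: str, status_code: int) -> None:
        super().__init__(message)
        self.status_code = status_code


class ContextWindowExceededError(APIError):
    """Stand-in for the context-overflow error, which also reports 400."""

    def __init__(self, message: str) -> None:
        super().__init__(message, status_code=400)


INPUT_REJECTION_CODES = {400, 413, 422}  # hypothetical values


def classify(exc: Exception) -> str:
    # The specific type check must run first; otherwise the 400 status
    # would match the generic guard below and trigger pointless retries.
    if isinstance(exc, ContextWindowExceededError):
        return "fail"
    if getattr(exc, "status_code", None) in INPUT_REJECTION_CODES:
        return "strip_images_and_retry"
    return "raise"
```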
Scan errors were stored in _scan_error and re-raised only after the
user closed the TUI, making the app appear stuck while producing a
confusing post-exit traceback.

Extract _notify_scan_error and call it from all three error branches
in _start_scan_thread so a persistent Textual toast is shown
immediately when the scan thread fails.
…missal

Textual's Notification expires when raised_at + timeout - time() <= 0.
With timeout=0 the toast expired immediately on mount, making the error
notification invisible to users. Drop the timeout argument to use
Textual's default display duration (~5 s).
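
The expiry condition quoted above can be checked with a few lines of arithmetic. This is a standalone re-statement of the formula, not Textual's code, and `has_expired` is a hypothetical helper:

```python
import time
from typing import Optional


def has_expired(raised_at: float, timeout: float, now: Optional[float] = None) -> bool:
    """Mirror the condition raised_at + timeout - time() <= 0."""
    current = time.time() if now is None else now
    return raised_at + timeout - current <= 0
```

With `timeout=0` the expression is already `<= 0` at the moment the toast is raised, which matches the behaviour described in the commit message.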
exec_command output is now capped at STRIX_MAX_TOOL_OUTPUT_CHARS (default
65536). JSON arrays are parsed and trimmed to the first 50 records (valid
JSON preserved); plain text keeps the first 300 lines. Both include a
header showing the total size and record/line count so the agent retains
full awareness of what was produced.

A secondary character cap prevents single very long lines from bypassing
the line limit. Set STRIX_MAX_TOOL_OUTPUT_CHARS=0 to disable.
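
A rough sketch of the truncation rules described above. The header wording and the function name are assumptions, and treating negative limits like 0 is a choice made only for this sketch:

```python
import json

MAX_RECORDS = 50
MAX_LINES = 300


def truncate_output(text: str, max_chars: int = 65536) -> str:
    """Cap oversized output; 0 disables (negatives are treated the same here)."""
    if max_chars <= 0 or len(text) <= max_chars:
        return text
    try:
        data = json.loads(text)
    except ValueError:
        data = None
    if isinstance(data, list):
        # JSON path: keep whole records so the body stays valid JSON.
        header = f"[truncated: {len(text)} chars, {len(data)} records, first {MAX_RECORDS} kept]"
        return header + "\n" + json.dumps(data[:MAX_RECORDS])
    lines = text.splitlines()
    header = f"[truncated: {len(text)} chars, {len(lines)} lines, first {MAX_LINES} kept]"
    body = "\n".join(lines[:MAX_LINES])
    # Secondary cap: a few very long lines could still exceed the budget.
    return header + "\n" + body[:max_chars]
```

Note that the JSON branch in this sketch is not character-capped, so a single huge record can still exceed the limit; it relies on the caller to apply a final backstop.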
Replaces plain truncation with a parallel summarisation pipeline when
exec_command output exceeds STRIX_MAX_TOOL_OUTPUT_CHARS (default 65536).
Output is split at JSON record or line boundaries; each chunk is
summarised via litellm.acompletion; summaries are consolidated into a
single result that fits the context window.

Falls back to truncate_exec_result on any compression error.

Closes usestrix#579
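
The split, summarise, consolidate flow can be sketched as follows. The summariser is injected as a plain async callable so the example runs without an LLM; in the PR that role is played by litellm.acompletion. All names here are hypothetical, and only line-boundary splitting is shown:

```python
import asyncio
from typing import Awaitable, Callable

Summarise = Callable[[str], Awaitable[str]]


def split_lines(text: str, chunk_chars: int) -> list[str]:
    """Split into chunks of about chunk_chars, breaking only at line boundaries."""
    chunks: list[str] = []
    current: list[str] = []
    size = 0
    for line in text.splitlines(keepends=True):
        if current and size + len(line) > chunk_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks


async def compress(
    text: str,
    summarise: Summarise,
    chunk_chars: int,
    fallback: Callable[[str], str],
) -> str:
    chunks = split_lines(text, chunk_chars)
    if len(chunks) <= 1:
        return text  # nothing to fan out; the caller decides whether to truncate
    try:
        # Map: summarise all chunks concurrently.
        partials = await asyncio.gather(*(summarise(c) for c in chunks))
        # Reduce: consolidate the partial summaries into one result.
        return await summarise("\n".join(partials))
    except Exception:
        # Any compression error falls back to plain truncation.
        return fallback(text)
```

Injecting the summariser also makes the fallback path easy to exercise in tests by passing a callable that raises.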
Code review findings:

- factory._wrap_exec_command: add post-compression backstop that calls
  truncate_exec_result when compress_exec_result returns an oversized
  result (happens when _split produces 1 chunk, e.g. a single very
  large JSON record — compress_large_output correctly returns it
  unchanged, but the result was never guarded against threshold after).

- tests/agents/test_factory_helpers.py (new): unit tests for
  _resolve_model (run_config vs settings fallback, None/whitespace
  error cases, non-string Model object guard) and _extract_task_hint
  (valid cmd, missing cmd, non-string cmd, invalid JSON, non-dict JSON).
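
The backstop from the first finding amounts to re-checking the size after compression. A minimal sketch, with the compressor and truncator passed in as hypothetical callables:

```python
from typing import Callable


def guard_result(
    result: str,
    max_chars: int,
    compress: Callable[[str], str],
    truncate: Callable[[str, int], str],
) -> str:
    """Compress oversized output, then truncate if it is still too large."""
    if max_chars <= 0 or len(result) <= max_chars:
        return result
    compressed = compress(result)
    # Backstop: a single-chunk input comes back unchanged, so re-check the size.
    if len(compressed) > max_chars:
        return truncate(compressed, max_chars)
    return compressed
```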
greptile-apps Bot commented Jun 22, 2026

Greptile Summary

This PR adds three self-contained features on top of PR #577: structure-aware truncation and MapReduce LLM compression for oversized exec_command outputs, a ContextWindowExceededError fast-path in the agent run loop that bypasses the (unhelpful) image-strip retry, and cross-thread TUI toasts for scan-thread failures. It also adds a --mount CLI option that bind-mounts large local directories read-only instead of streaming them file-by-file.

  • Large output pipeline: outputs exceeding STRIX_MAX_TOOL_OUTPUT_CHARS are first compressed via a parallel MapReduce LLM fan-out; single-chunk outputs and compression failures fall back to structure-aware truncation (JSON arrays kept as valid JSON, plain text line-capped), with a final character-count backstop in _wrap_exec_command.
  • Context window bypass: ContextWindowExceededError (status 400) is now intercepted before the image-strip retry loop, parks the agent as "failed", and propagates correctly for root agents while returning None for child agents.
  • --mount bind-mount support: build_mount_targets_info / build_session_entries / StrixDockerSandboxClient are wired together to pass host directories to Docker at container-create time; a pre-flight size check rejects oversized --target directories and directs users to --mount.

Confidence Score: 5/5

Safe to merge; all three feature areas are well-isolated with dedicated tests, and the fallback chains preserve session correctness.

The truncation/compression pipeline has a solid backstop so oversized outputs cannot bypass the character cap. The ContextWindowExceededError handler is tested end-to-end including the non-retry assertion. The bind-mount wiring is straightforward. The two issues noted are a misleading log message in non-interactive mode and a deprecated asyncio.wait_for(coroutine) call pattern — neither affects correctness.

No files require special attention; strix/core/execution.py has a misleading log message and strix/core/mapreduce_output.py has the deprecated asyncio.wait_for(coroutine) call, but neither introduces wrong behavior.

Important Files Changed

| Filename | Overview |
| --- | --- |
| strix/core/large_output.py | New module: structure-aware truncation for oversized tool outputs. Halving loop, SDK header preservation, and text/JSON branching all look correct. Single-oversized-record edge case (noted in previous review) is handled by caller backstop. |
| strix/core/mapreduce_output.py | New MapReduce LLM compression module. Fan-out to parallel litellm calls is sound; single-chunk fast-path correctly delegates truncation to caller. asyncio.wait_for receives a bare coroutine instead of a Task (deprecated in Python 3.12+). Broad except Exception in _summarise is intentional for chunk-level resilience. |
| strix/agents/factory.py | Added MapReduce compression + truncation backstop inside _wrap_exec_command. Fallback chain (compress → backstop truncate) is correct; load_settings() called inside the hot tool-invocation path, which is fine if settings are cached by the config layer. |
| strix/core/execution.py | ContextWindowExceededError handler correctly bypasses image-strip retry. Log message "parking as failed" fires before the if not interactive: raise guard and is factually wrong in non-interactive mode. |
| strix/interface/tui/app.py | Scan error toast via call_from_thread with contextlib.suppress is the correct cross-thread notification pattern. Removed explicit timeout=0 (which caused immediate expiration) and uses Textual's default timeout instead. |
| strix/interface/utils.py | New helpers: build_mount_targets_info, dedupe_local_targets, find_oversized_local_targets, directory_size_bytes. Deduplication prefers bind-mounted entries; path resolution in build_mount_targets_info is consistent. Size walk is stat-only and best-effort as documented. |
| strix/runtime/session_manager.py | Refactored to build_session_entries which splits local sources into copied entries and bind-mount specs. Logic is correct; mounted paths are excluded from SDK manifest (preventing file-by-file copy). |
| strix/runtime/docker_client.py | Added strix_bind_mounts class attribute (now None not [] per prior review) and Docker mount injection in _create_container. Pyright suppression comments added for private SDK imports. |
| strix/runtime/backends.py | Passes bind_mounts through from session manager to Docker backend. Straightforward wiring change. |
| strix/config/settings.py | Added max_local_copy_mb and max_tool_output_chars settings with documented env-var aliases and defaults. Zero-disables convention is consistent across both fields. |
| strix/interface/main.py | Added --mount CLI argument with correct action="append", error messages updated, pre-flight size check wired in. dedupe_local_targets called after both --target and --mount lists are assembled. |

Reviews (2): Last reviewed commit: "Fix review comment"

Comment thread strix/runtime/docker_client.py Outdated
@mhspektr


I suggest merging #577, updating this branch, and rerunning greptile before reviewing.
