Fix memory compressor timeout fallback by RitwijParmar · Pull Request #498 · usestrix/strix

RitwijParmar · 2026-05-26T15:10:11Z

Summary

disable LiteLLM retry amplification for memory-compressor summarization calls
add a local extractive fallback summary when compressor LLM calls fail or time out
preserve recent scan messages while avoiding long-scan stalls caused by repeated summarization failures

Why

This addresses the reliability failure mode in #470 where long-context scans can get stuck when the memory compressor repeatedly hits LiteLLM timeouts. The fallback keeps the scan moving while retaining ordered message previews and recent operational context.

Tests

PYTEST_ADDOPTS='' .venv/bin/python -m pytest -o addopts='' tests/llm/test_memory_compressor.py

greptile-apps · 2026-05-26T15:12:13Z

Greptile Summary

This PR fixes a reliability failure in long-context scans where repeated LiteLLM summarization timeouts could stall the memory compressor. It adds num_retries=0 to prevent retry amplification and introduces _build_fallback_summary, an extractive local summarizer that preserves head/tail message previews when the LLM call fails.

num_retries=0 is passed to litellm.completion so a single timeout does not silently trigger multiple retries.
_build_fallback_summary returns a context_summary message with up to 12 sampled previews (head + tail) when the LLM raises any exception; the existing exception handler now delegates to it instead of returning messages[0].
The empty-response branch (if not summary.strip(): return messages[0]) was not updated to use the new fallback, leaving one failure mode — an LLM that responds with blank content — still returning a raw uncompressed old message.

Confidence Score: 3/5

The change is almost complete but one branch was not updated consistently with the rest.

The empty-summary guard at line 178 still returns messages[0] directly, which is exactly the raw-old-message return the PR is designed to eliminate. An LLM that responds with an empty string triggers this path and re-creates the context inflation the fix targets. The rest of the change — disabling retries and the fallback builder — is correct and well-tested.

strix/llm/memory_compressor.py line 178 needs the same _build_fallback_summary treatment applied to the exception handler above it.

Important Files Changed

Filename	Overview
strix/llm/memory_compressor.py	Adds `num_retries=0` to suppress LiteLLM retry amplification and introduces `_build_fallback_summary` as an extractive local fallback on exception; the empty-LLM-response branch at line 178 still returns `messages[0]` (a raw old message) instead of the fallback, leaving one failure mode unaddressed.
tests/llm/test_memory_compressor.py	New test file covering retry-disable, timeout fallback, and compress_history fallback path; the empty-response branch (`not summary.strip()`) is not exercised so the surviving `messages[0]` return goes undetected.

Comments Outside Diff (1)

strix/llm/memory_compressor.py, line 177-178 (link)

When the LLM returns an empty or whitespace-only response, the code still returns messages[0] — a raw, potentially large old message — rather than the new fallback. This bypasses the core fix this PR introduces: the PR prevents raw-message returns on exceptions/timeouts but leaves this branch unaddressed. An LLM that responds with an empty string (a real failure mode) would still push an uncompressed old message back into the conversation, re-creating the stall.

Prompt To Fix With AI

This is a comment left during a code review.
Path: strix/llm/memory_compressor.py
Line: 177-178

Comment:
When the LLM returns an empty or whitespace-only response, the code still returns `messages[0]` — a raw, potentially large old message — rather than the new fallback. This bypasses the core fix this PR introduces: the PR prevents raw-message returns on exceptions/timeouts but leaves this branch unaddressed. An LLM that responds with an empty string (a real failure mode) would still push an uncompressed old message back into the conversation, re-creating the stall.



How can I resolve this? If you propose a fix, please make it concise.

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
strix/llm/memory_compressor.py:177-178
When the LLM returns an empty or whitespace-only response, the code still returns `messages[0]` — a raw, potentially large old message — rather than the new fallback. This bypasses the core fix this PR introduces: the PR prevents raw-message returns on exceptions/timeouts but leaves this branch unaddressed. An LLM that responds with an empty string (a real failure mode) would still push an uncompressed old message back into the conversation, re-creating the stall.

```suggestion
        if not summary.strip():
            return _build_fallback_summary(messages)
```

_{Reviews (1): Last reviewed commit: "fix memory compressor timeout fallback" | Re-trigger Greptile}

fix memory compressor timeout fallback

08c3096

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix memory compressor timeout fallback#498

Fix memory compressor timeout fallback#498
RitwijParmar wants to merge 1 commit into
usestrix:mainfrom
RitwijParmar:ritwij/memory-compressor-timeout-fallback

RitwijParmar commented May 26, 2026

greptile-apps Bot commented May 26, 2026 •

edited

Loading

Comments Outside Diff (1)

Labels

1 participant

Uh oh!

Conversation

RitwijParmar commented May 26, 2026

Summary

Why

Tests

greptile-apps Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Comments Outside Diff (1)

Labels

1 participant

greptile-apps Bot commented May 26, 2026 •

edited

Loading