Skip to content

Conversation

@symbiosis-institute
Copy link

Summary

Fixes the bug where the "I'm thinking..." spinner gets stuck indefinitely when SSE streams or WebSocket events fail to complete.

Root Cause

  • Chat mode: SSE stream reading in chat.service.ts uses a while(true) loop that can hang if the server never sends a done or error event
  • Agent mode: Waits for COMPLETE/ERROR WebSocket events that may never arrive
  • Without a timeout, setLoading(false) is never called, causing the spinner to spin forever

Changes

  • chat.service.ts: Add timeoutMs option (default 2 minutes) to streamQuery
    • Timeout aborts the fetch and sends an error event
    • Clear timeout in finally block to prevent memory leaks
  • use-chat-transport.tsx: Show toast errors for timeout and other errors
  • use-app-events.tsx: Add 5-minute timeout safety for agent mode loading state
  • typings/chat.ts: Add timeoutMs to ChatStreamOptions interface

Testing

  • Verified TypeScript compiles without errors
  • Manual testing: spinner should now clear within timeout even if backend hangs

Evidence

  • Typecheck: PASSED (no errors)
  • Files changed: 4 files, 53 insertions(+), 3 deletions(-)
mdear and others added 13 commits December 24, 2025 05:40
- Add DockerSandbox provider for air-gapped/local deployments
- Add PortPoolManager for centralized port allocation (30000-30999)
- Add LocalStorage providers for ii_agent and ii_tool
- Add MCP tool image processing from sandbox containers
- Add storage factory functions with local/GCS support
- Add test suite (143 tests passing)
- Fix connect() to register ports preventing conflicts on reconnect
- Fix delete() to cleanup orphaned volumes
- Update docs with port management and local sandbox setup
Chat file handling:
- Fix file_search filtering by user_id only (not session_id) for cross-session access
- Add SHA-256 content hash deduplication in OpenAI vector store
- Reduce file_search max results to 3 to prevent context overflow
- Add file corpus discovery so AI knows which files are searchable
- Fix reasoning.effort parameter only sent to reasoning models
- Add hasattr guard for text attribute on image-only messages

Sandbox management:
- Add orphan cleanup loop (5min interval) to remove containers without active sessions
- Add /internal/sandboxes/{id}/has-active-session endpoint for session verification
- Add port_manager.scan_existing_containers() to recover state on restart
- Add LOCAL_MODE config with orphan cleanup settings

Resource limits:
- Add MAX_TABS=20 limit in browser with force-close of oldest tabs
- Add MAX_SHELL_SESSIONS=10 limit in shell tool

Tests: Add 248 unit tests covering all changes
## New Features
- expose_port(external) parameter: external=True returns localhost:port for browser access,
  external=False returns internal Docker IP for container-to-container communication
- LLMConfig.get_max_output_tokens(): Model-specific output token limits
  (64K Claude 4, 100K o1, 16K GPT-4, 8K Gemini)
- Browser MAX_TABS=20 limit with automatic cleanup of oldest tabs
- Shell session MAX_SHELL_SESSIONS=15 limit with clear error messages
- Anthropic native thinking blocks support via beta endpoint
- Extended context (1M tokens) support for Claude models

## Frontend Improvements
- Added selectIsStopped selector for proper stopped state UI handling
- Fixed agent task state transitions for cancelled sessions
- Improved subagent container with session awareness

## New Test Coverage (343 tests total)
- tests/llm/test_llm_config.py: LLMConfig.get_max_output_tokens() tests
- tests/tools/test_browser_tab_limit.py: Browser MAX_TABS enforcement
- tests/tools/test_resource_limits.py: Browser and shell session limits
- tests/tools/test_generation_config_factory.py: Image/video generation configs
- tests/tools/test_openai_dalle.py: DALL-E 3 image generation client
- tests/tools/test_openai_sora.py: Sora video generation client
- tests/storage/test_local_storage.py: LocalStorage.get_permanent_url()
- tests/storage/test_tool_local_storage.py: Tool server LocalStorage

## Code Quality
- Removed debug print statements from anthropic.py
- Removed trailing whitespace from all files
- Fixed test assertions to match implementation behavior
…am PR172

Combines:
- Upstream PR172 (local-docker-sandbox from mdear)
- Our UX fixes: Google GSI guard, login route crash fix, dev autologin

Changes:
- frontend: Add Google GSI guard (don't init without client_id; skip when dev autologin enabled)
- frontend: Fix /login route crash when Google OAuth is disabled
- docker: Pass VITE_DEV_AUTH_AUTOLOGIN via local-only compose build args
- test: Add Playwright smoke that fails on runtime/console errors
- chore: Ignore Playwright test artifacts (playwright-report/, test-results/)

Gates:
- Backend tests: 52/52 PASSED
- API health: 200/200
- Playwright smoke: 4/4 PASSED
- codex-mcp: PASS (with notes for future improvements)
- gemini-cli: Reviewed (minor findings noted)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…itespace guard

Fix A - Dev auto-login resilience:
- Add 10-second timeout with AbortController
- Proper cleanup of isAutoLoggingIn on timeout
- Clear timeout in all code paths (success/error/abort)

Fix B - Safer dev-auth defaults:
- Change DEV_AUTH_ENABLED from hardcoded "true" to ${DEV_AUTH_ENABLED:-false}
- Add prominent security warning in .stack.env.local.example
- Dev auth is now OPT-IN only

Fix C - Whitespace client_id guard:
- Add .trim() to VITE_GOOGLE_CLIENT_ID in provider.tsx
- Align googleEnabled logic in login.tsx with trimmed value

Gates:
- Backend tests: 52/52 PASSED
- API health: 200/200
- Playwright smoke: 4/4 PASSED
- codex-mcp: PASS (all findings resolved)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…enai worker

This change enables ii-agent local Docker stack to work with OpenAI-compatible
LLM workers like gemini-cli-openai (GewoonJaap/gemini-cli-openai) running on
the host machine.

Changes:
- Add OPENAI_BASE_URL env var documentation to .stack.env.local.example
- Add chat smoke test (frontend/e2e/chat-smoke.spec.ts) with mocked SSE for
  deterministic testing; supports real provider testing via E2E_REAL_LLM=1
- Add API smoke script (scripts/smoke-openai-base-url.sh) to validate
  /v1/models and /v1/chat/completions endpoints

The backend already supports base_url via LLMConfig.base_url, so no backend
changes were needed. The smoke tests provide coverage for the chat flow with
both mocked and real LLM providers.

QA:
- Backend tests: 52/52 PASSED
- Playwright tests: 5/5 PASSED (including chat-smoke)
- Codex MCP: P2 issue fixed (console error filtering logic)
- Gemini CLI: No blocking issues

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Change pyproject.toml: ddgs>=9.9.1 → duckduckgo-search>=8.1.1
- The 'ddgs' package name was incorrect; actual PyPI package is 'duckduckgo-search'
- Import statements already updated: from duckduckgo_search import DDGS
- This fixes P1 issue identified by codex-mcp QA gate

QA: all 4 import files verified working with uv run pytest

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add SSE stream consumption logic in OpenAI provider agenerate()
  * gemini-cli-openai worker returns SSE by default
  * Use stream=True and consume synchronously with list+join for O(n) performance
  * Build synthetic response object compatible with non-streaming interface

- Security: Add .gitignore for auth artifacts and test results
  * frontend/e2e/.auth/ (contains session tokens)
  * playwright-report/, test-results/

- Performance: Refactor all SSE consumers to use list+join
  * agenerate() - agent mode SSE consumption
  * Streaming chat consumers

- Tests: Add backend unit tests for SSE stream consumption
  * tests/llm/test_sse_stream_consumption.py (7 tests)
  * Cover: multi-chunk, tool calls, finish reason, list+join pattern

- E2E: Add agent mode smoke test with REAL LLM
  * Requires E2E_REAL_LLM=1 (worker integration validated)
  * Backend unit tests cover SSE logic, mocked E2E not needed

Test Results:
- Backend: 59 tests PASSED (including 7 new SSE tests)
- API smoke: PASSED
- Playwright: 2/2 PASSED (chat + agent mode with REAL LLM)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add timeout mechanisms to prevent "I'm thinking..." spinner from hanging
indefinitely when SSE streams or WebSocket events fail to complete.

Changes:
- chat.service.ts: Add timeoutMs option (default 2 minutes) to streamQuery
  - Timeout aborts the fetch and sends an error event
  - Clear timeout in finally block to prevent memory leaks
- use-chat-transport.tsx: Show toast errors for timeout and other errors
- use-app-events.tsx: Add 5-minute timeout safety for agent mode loading state
- typings/chat.ts: Add timeoutMs to ChatStreamOptions interface

Root cause:
- SSE stream reading in chat.service.ts uses while(true) loop that can hang
- Agent mode waits for COMPLETE/ERROR WebSocket events that may never arrive
- Without timeout, setLoading(false) is never called and spinner spins forever

This fix ensures the UI always terminates with success or a surfaced error
within a bounded time.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants