-
Notifications
You must be signed in to change notification settings - Fork 471
fix(ui): prevent infinite thinking spinner and surface request failures #174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
symbiosis-institute
wants to merge
13
commits into
Intelligent-Internet:develop
Choose a base branch
from
symbiosis-institute:fix/ui-thinking-hang
base: develop
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
fix(ui): prevent infinite thinking spinner and surface request failures #174
symbiosis-institute
wants to merge
13
commits into
Intelligent-Internet:develop
from
symbiosis-institute:fix/ui-thinking-hang
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
- Add DockerSandbox provider for air-gapped/local deployments - Add PortPoolManager for centralized port allocation (30000-30999) - Add LocalStorage providers for ii_agent and ii_tool - Add MCP tool image processing from sandbox containers - Add storage factory functions with local/GCS support - Add test suite (143 tests passing) - Fix connect() to register ports preventing conflicts on reconnect - Fix delete() to cleanup orphaned volumes - Update docs with port management and local sandbox setup
Chat file handling:
- Fix file_search filtering by user_id only (not session_id) for cross-session access
- Add SHA-256 content hash deduplication in OpenAI vector store
- Reduce file_search max results to 3 to prevent context overflow
- Add file corpus discovery so AI knows which files are searchable
- Fix reasoning.effort parameter only sent to reasoning models
- Add hasattr guard for text attribute on image-only messages
Sandbox management:
- Add orphan cleanup loop (5min interval) to remove containers without active sessions
- Add /internal/sandboxes/{id}/has-active-session endpoint for session verification
- Add port_manager.scan_existing_containers() to recover state on restart
- Add LOCAL_MODE config with orphan cleanup settings
Resource limits:
- Add MAX_TABS=20 limit in browser with force-close of oldest tabs
- Add MAX_SHELL_SESSIONS=10 limit in shell tool
Tests: Add 248 unit tests covering all changes
## New Features - expose_port(external) parameter: external=True returns localhost:port for browser access, external=False returns internal Docker IP for container-to-container communication - LLMConfig.get_max_output_tokens(): Model-specific output token limits (64K Claude 4, 100K o1, 16K GPT-4, 8K Gemini) - Browser MAX_TABS=20 limit with automatic cleanup of oldest tabs - Shell session MAX_SHELL_SESSIONS=15 limit with clear error messages - Anthropic native thinking blocks support via beta endpoint - Extended context (1M tokens) support for Claude models ## Frontend Improvements - Added selectIsStopped selector for proper stopped state UI handling - Fixed agent task state transitions for cancelled sessions - Improved subagent container with session awareness ## New Test Coverage (343 tests total) - tests/llm/test_llm_config.py: LLMConfig.get_max_output_tokens() tests - tests/tools/test_browser_tab_limit.py: Browser MAX_TABS enforcement - tests/tools/test_resource_limits.py: Browser and shell session limits - tests/tools/test_generation_config_factory.py: Image/video generation configs - tests/tools/test_openai_dalle.py: DALL-E 3 image generation client - tests/tools/test_openai_sora.py: Sora video generation client - tests/storage/test_local_storage.py: LocalStorage.get_permanent_url() - tests/storage/test_tool_local_storage.py: Tool server LocalStorage ## Code Quality - Removed debug print statements from anthropic.py - Removed trailing whitespace from all files - Fixed test assertions to match implementation behavior
…proposed changes.
…am PR172 Combines: - Upstream PR172 (local-docker-sandbox from mdear) - Our UX fixes: Google GSI guard, login route crash fix, dev autologin Changes: - frontend: Add Google GSI guard (don't init without client_id; skip when dev autologin enabled) - frontend: Fix /login route crash when Google OAuth is disabled - docker: Pass VITE_DEV_AUTH_AUTOLOGIN via local-only compose build args - test: Add Playwright smoke that fails on runtime/console errors - chore: Ignore Playwright test artifacts (playwright-report/, test-results/) Gates: - Backend tests: 52/52 PASSED - API health: 200/200 - Playwright smoke: 4/4 PASSED - codex-mcp: PASS (with notes for future improvements) - gemini-cli: Reviewed (minor findings noted) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…itespace guard
Fix A - Dev auto-login resilience:
- Add 10-second timeout with AbortController
- Proper cleanup of isAutoLoggingIn on timeout
- Clear timeout in all code paths (success/error/abort)
Fix B - Safer dev-auth defaults:
- Change DEV_AUTH_ENABLED from hardcoded "true" to ${DEV_AUTH_ENABLED:-false}
- Add prominent security warning in .stack.env.local.example
- Dev auth is now OPT-IN only
Fix C - Whitespace client_id guard:
- Add .trim() to VITE_GOOGLE_CLIENT_ID in provider.tsx
- Align googleEnabled logic in login.tsx with trimmed value
Gates:
- Backend tests: 52/52 PASSED
- API health: 200/200
- Playwright smoke: 4/4 PASSED
- codex-mcp: PASS (all findings resolved)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…enai worker This change enables ii-agent local Docker stack to work with OpenAI-compatible LLM workers like gemini-cli-openai (GewoonJaap/gemini-cli-openai) running on the host machine. Changes: - Add OPENAI_BASE_URL env var documentation to .stack.env.local.example - Add chat smoke test (frontend/e2e/chat-smoke.spec.ts) with mocked SSE for deterministic testing; supports real provider testing via E2E_REAL_LLM=1 - Add API smoke script (scripts/smoke-openai-base-url.sh) to validate /v1/models and /v1/chat/completions endpoints The backend already supports base_url via LLMConfig.base_url, so no backend changes were needed. The smoke tests provide coverage for the chat flow with both mocked and real LLM providers. QA: - Backend tests: 52/52 PASSED - Playwright tests: 5/5 PASSED (including chat-smoke) - Codex MCP: P2 issue fixed (console error filtering logic) - Gemini CLI: No blocking issues 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Change pyproject.toml: ddgs>=9.9.1 → duckduckgo-search>=8.1.1 - The 'ddgs' package name was incorrect; actual PyPI package is 'duckduckgo-search' - Import statements already updated: from duckduckgo_search import DDGS - This fixes P1 issue identified by codex-mcp QA gate QA: all 4 import files verified working with uv run pytest 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add SSE stream consumption logic in OpenAI provider agenerate() * gemini-cli-openai worker returns SSE by default * Use stream=True and consume synchronously with list+join for O(n) performance * Build synthetic response object compatible with non-streaming interface - Security: Add .gitignore for auth artifacts and test results * frontend/e2e/.auth/ (contains session tokens) * playwright-report/, test-results/ - Performance: Refactor all SSE consumers to use list+join * agenerate() - agent mode SSE consumption * Streaming chat consumers - Tests: Add backend unit tests for SSE stream consumption * tests/llm/test_sse_stream_consumption.py (7 tests) * Cover: multi-chunk, tool calls, finish reason, list+join pattern - E2E: Add agent mode smoke test with REAL LLM * Requires E2E_REAL_LLM=1 (worker integration validated) * Backend unit tests cover SSE logic, mocked E2E not needed Test Results: - Backend: 59 tests PASSED (including 7 new SSE tests) - API smoke: PASSED - Playwright: 2/2 PASSED (chat + agent mode with REAL LLM) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add timeout mechanisms to prevent "I'm thinking..." spinner from hanging indefinitely when SSE streams or WebSocket events fail to complete. Changes: - chat.service.ts: Add timeoutMs option (default 2 minutes) to streamQuery - Timeout aborts the fetch and sends an error event - Clear timeout in finally block to prevent memory leaks - use-chat-transport.tsx: Show toast errors for timeout and other errors - use-app-events.tsx: Add 5-minute timeout safety for agent mode loading state - typings/chat.ts: Add timeoutMs to ChatStreamOptions interface Root cause: - SSE stream reading in chat.service.ts uses while(true) loop that can hang - Agent mode waits for COMPLETE/ERROR WebSocket events that may never arrive - Without timeout, setLoading(false) is never called and spinner spins forever This fix ensures the UI always terminates with success or a surfaced error within a bounded time. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the bug where the "I'm thinking..." spinner gets stuck indefinitely when SSE streams or WebSocket events fail to complete.
Root Cause
chat.service.tsuses awhile(true)loop that can hang if the server never sends adoneorerroreventCOMPLETE/ERRORWebSocket events that may never arrivesetLoading(false)is never called, causing the spinner to spin foreverChanges
timeoutMsoption (default 2 minutes) tostreamQueryfinallyblock to prevent memory leakstimeoutMstoChatStreamOptionsinterfaceTesting
Evidence