feat(agent): live "Thinking…" indicator for reasoning models #98
Open
Mariomarquezt wants to merge 8 commits into
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…olish
- Add normalizeOpenAiBaseUrl() to chatCompletions.ts that strips trailing slashes and an optional trailing /v1 segment, preventing the /v1/v1/ double-append footgun when users paste provider-documented URLs.
- Use normalizeOpenAiBaseUrl in makeChatCompletionsAdapter (endpoint) and fetchOpenAiCompatibleModels (/v1/models fetch); drop the now-unused trimSlash import from openaiCompatible.ts.
- Remove redundant 'as AiProviderId' cast (M4); drop the unused import.
- Add normalizeOpenAiBaseUrl test coverage in chatCompletions.test.ts and a /v1-suffixed base-URL normalization case in openaiCompatible.test.ts.
- Update AiAuthMode baseUrl JSDoc to reflect Ollama + openai-compatible (M1).
- Add OpenAI-Compatible to contextTokens.ts comment for parity (M3).
- Update ProvidersTab base-URL placeholder to https://api.groq.com/openai/v1 so the UI matches the now-correct /v1-inclusive provider-documented form.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0115W5vEDNwsWeaeS5PyTFgG
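The base-URL normalization described in this commit could look roughly like the sketch below. The PR's actual implementation is not shown on this page, so the function body and the `chatCompletionsEndpoint` helper are assumptions made for illustration; only the name `normalizeOpenAiBaseUrl` and the described behavior come from the commit message.

```typescript
// Hypothetical sketch of the normalization described above; not the PR's code.
function normalizeOpenAiBaseUrl(baseUrl: string): string {
  // Strip any trailing slashes first.
  const withoutSlashes = baseUrl.replace(/\/+$/, "");
  // Strip one optional trailing "/v1" segment, then any slashes it exposes.
  return withoutSlashes.replace(/\/v1$/, "").replace(/\/+$/, "");
}

// Hypothetical caller: the versioned path is appended exactly once,
// so a pasted ".../v1" URL can no longer produce ".../v1/v1/...".
function chatCompletionsEndpoint(baseUrl: string): string {
  return `${normalizeOpenAiBaseUrl(baseUrl)}/v1/chat/completions`;
}
```

With this shape, a provider-documented URL that already ends in /v1 and a bare host both resolve to a single /v1 segment in the final endpoint.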
The id stays 'openai-compatible' (stable registry/DB identifier); only the user-facing display label changes — dropdown, credential card, driver label, and docs. Protocol descriptions and the filename are unchanged.
Real OpenAI-compatible gateways (OpenCode Zen, OpenRouter, vLLM, …) send explicit `null` for optional per-chunk fields (`usage: null`, `tool_calls: null`, `delta.content: null`) on every chunk. The chunk schema used Type.Optional, which accepts absent-or-value but not null, so parseValue threw, the frame was dropped in translate()'s catch, and the model's entire reply silently vanished — reasoning models (GLM, DeepSeek, Qwen, MiniMax) appeared to 'not reply'. Wrap the optional fields in a nullable() helper so both absent and null validate. Verified against real gateway frames.
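To make the optional-versus-nullable distinction concrete, here is a minimal, dependency-free sketch. It deliberately does not use the project's schema library; `Validator`, `optional`, `isString`, and the two example validators are hypothetical stand-ins that only mirror the behavior described above (optional accepts absent-or-value, the nullable wrapper also accepts an explicit `null`).

```typescript
// Hypothetical mini-validator, used only to illustrate the behavior.
type Validator<T> = (value: unknown) => value is T;

const isString: Validator<string> = (v): v is string => typeof v === "string";

// Accepts an absent field (undefined) or a valid value, but rejects null.
function optional<T>(inner: Validator<T>): Validator<T | undefined> {
  return (v): v is T | undefined => v === undefined || inner(v);
}

// Accepts an absent field, an explicit null, or a valid value.
function nullable<T>(inner: Validator<T>): Validator<T | null | undefined> {
  return (v): v is T | null | undefined =>
    v === null || v === undefined || inner(v);
}

// A chunk carrying `content: null` fails the first check and passes the second.
const strictContent = optional(isString);
const lenientContent = nullable(isString);
```

Under the strict form, a single `null` field is enough to reject the whole frame, which matches the dropped-reply symptom described in the commit message.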
Reasoning models reached over the chat/completions wire stream their
chain-of-thought in delta.reasoning_content / delta.reasoning while the answer
stays empty — the panel looked frozen for seconds. Add an ephemeral reasoning
stream event: the chat/completions translator emits { type: 'reasoning' }
(never added to the assistant message, so it is not persisted or replayed); the
runner forwards it; the agent store accumulates it (rAF-batched, session-only)
into AgentMessage.reasoning; the panel shows an animated 'Thinking…' indicator
while reasoning streams with no answer yet, then an on-demand 'Show reasoning'
expander. No DB/schema changes. Reasoning UI extracted to MessageReasoning to
keep AgentPanel under the module-size ceiling.
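The translator step can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `ChunkDelta`, `translateDelta`, and the `'text'` event name for the answer are hypothetical; the `'reasoning'` event shape and the two reasoning field names come from the description above.

```typescript
// Hypothetical delta shape; gateways may send explicit null for any field.
interface ChunkDelta {
  content?: string | null;
  reasoning_content?: string | null;
  reasoning?: string | null;
}

type AiStreamEvent =
  | { type: "text"; text: string } // assumed name for the answer event
  | { type: "reasoning"; text: string }; // ephemeral, never persisted

function translateDelta(delta: ChunkDelta): AiStreamEvent[] {
  const events: AiStreamEvent[] = [];
  // Prefer reasoning_content and fall back to reasoning on other gateways.
  const reasoning = delta.reasoning_content ?? delta.reasoning;
  if (reasoning) events.push({ type: "reasoning", text: reasoning });
  // Answer text stays on its own event, so reasoning cannot leak into it.
  if (delta.content) events.push({ type: "text", text: delta.content });
  return events;
}
```

Keeping reasoning on a separate event type is what lets downstream code skip it when building the persisted assistant message.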
What & why
Reasoning models reached over the chat/completions wire stream their chain-of-thought in `delta.reasoning_content` (or `delta.reasoning` on some gateways) while the visible answer (`delta.content`) stays empty for several seconds. The agent panel showed nothing during that window, so it looked frozen. This surfaces a lightweight, ephemeral "Thinking…" indicator plus an on-demand "Show reasoning" expander.
Approach
- New ephemeral stream event `{ type: 'reasoning'; text }` in `AiStreamEvent`.
- The chat/completions translator emits it from `delta.reasoning_content ?? delta.reasoning`. It is never appended to the assistant message, so reasoning is not persisted and never replayed to the provider.
- The runner forwards the event, and the agent store accumulates it (rAF-batched, session-only) into `AgentMessage.reasoning`.
- `AgentPanel` shows an animated "Thinking…" indicator while a turn reasons with no answer yet, then a collapsed "Show reasoning" expander once the answer arrives. UI extracted to a small `MessageReasoning` component. No DB changes.
Stacking note
This builds on #97 (it depends on the shared `chatCompletions.ts` adapter introduced there). Until #97 merges, this PR's diff also shows the provider commits. Happy to rebase onto `main` once #97 lands so this becomes a clean, isolated diff.
User / developer impact
Reasoning models (and any chat/completions model that emits reasoning) no longer appear idle; users see live "thinking" feedback and can inspect the reasoning on demand. Non-reasoning models are unaffected (no indicator shown).
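The user-visible behavior can be modeled as a small reducer plus a derived UI state. This is a simplified sketch, not the PR's store or component: `applyEvent`, `reasoningUi`, the `ReasoningUi` labels, and the `'text'` event name are hypothetical, and the rAF batching mentioned in the commit message is omitted.

```typescript
// Hypothetical message shape; `reasoning` is session-only and never saved.
interface AgentMessage {
  content: string;
  reasoning?: string;
}

type StreamEvent =
  | { type: "text"; text: string } // assumed name for the answer event
  | { type: "reasoning"; text: string };

// Route each event to its own field so reasoning never mixes with the answer.
function applyEvent(message: AgentMessage, event: StreamEvent): AgentMessage {
  if (event.type === "reasoning") {
    return { ...message, reasoning: (message.reasoning ?? "") + event.text };
  }
  return { ...message, content: message.content + event.text };
}

type ReasoningUi = "none" | "thinking" | "expander";

// No reasoning: show nothing. Reasoning without an answer: "Thinking…".
// Reasoning with an answer: collapsed "Show reasoning" expander.
function reasoningUi(message: AgentMessage): ReasoningUi {
  if (!message.reasoning) return "none";
  return message.content.length === 0 ? "thinking" : "expander";
}
```

Because the state is derived from the two fields alone, a model that never emits reasoning stays in the "none" state for the whole turn.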
Verification
- `bun test`: translator tests (reasoning emitted for both field names; never leaks into the answer) + stream-reducer routing test.
- `bun run build` and `bun run lint`: green.
Disclosure
Authored with Claude Code (Claude Opus 4.8), reviewed and live-tested by the submitter. Harness: Claude Code.