feat(agent): live "Thinking…" indicator for reasoning models#98

Open
Mariomarquezt wants to merge 8 commits into CoreBunch:main from Mariomarquezt:feat/agent-reasoning-display

@Mariomarquezt

What & why

Reasoning models reached over the chat/completions wire stream their chain-of-thought in delta.reasoning_content (or delta.reasoning on some gateways) while the visible answer (delta.content) stays empty for several seconds. The agent panel showed nothing during that window — it looked frozen. This surfaces a lightweight, ephemeral "Thinking…" indicator plus an on-demand "Show reasoning" expander.

Approach

  • New ephemeral stream event { type: 'reasoning'; text } in AiStreamEvent.
  • The chat/completions translator emits it from delta.reasoning_content ?? delta.reasoning. It is never appended to the assistant message, so reasoning is not persisted and never replayed to the provider.
  • The runner forwards it (no persistence). The agent store accumulates it (rAF-batched, like text) into a session-only AgentMessage.reasoning.
  • AgentPanel shows an animated "Thinking…" indicator while a turn reasons with no answer yet, then a collapsed "Show reasoning" expander once the answer arrives. UI extracted to a small MessageReasoning component. No DB changes.
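
For reviewers, the translator step above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `text` event variant, the `ChunkDelta` shape, and the `translateDelta` and `appendToAssistantMessage` helpers are assumptions made for the example.

```typescript
// Hypothetical, simplified shapes; the real AiStreamEvent has more variants.
type AiStreamEvent =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string };

// Assumed subset of a chat/completions chunk delta.
interface ChunkDelta {
  content?: string | null;
  reasoning_content?: string | null;
  reasoning?: string | null;
}

// Map one streamed delta to zero or more stream events.
function translateDelta(delta: ChunkDelta): AiStreamEvent[] {
  const events: AiStreamEvent[] = [];
  // Prefer reasoning_content; fall back to the gateway-specific reasoning field.
  const reasoning = delta.reasoning_content ?? delta.reasoning;
  if (reasoning) events.push({ type: "reasoning", text: reasoning });
  if (delta.content) events.push({ type: "text", text: delta.content });
  return events;
}

// Only text events reach the assistant message, so reasoning is never
// persisted or replayed to the provider.
function appendToAssistantMessage(message: string, event: AiStreamEvent): string {
  return event.type === "text" ? message + event.text : message;
}
```

Keeping reasoning as its own event type means the persistence path needs no special casing beyond ignoring events that are not text.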

Stacking note

This builds on #97 (it depends on the shared chatCompletions.ts adapter introduced there). Until #97 merges, this PR's diff also shows the provider commits. Happy to rebase onto main once #97 lands so this becomes a clean, isolated diff.

User / developer impact

Reasoning models (and any chat/completions model that emits reasoning) no longer appear idle; users see live "thinking" feedback and can inspect the reasoning on demand. Non-reasoning models are unaffected (no indicator shown).
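
The display rule described here could be expressed as a small pure function. The sketch below is an assumption about how the panel decides what to render: `AgentMessageView`, `ReasoningUi`, and `reasoningUiState` are hypothetical names, and falling back to the expander when a turn ends without an answer is a guess.

```typescript
// Hypothetical view of a message as the panel sees it.
interface AgentMessageView {
  text: string;        // visible answer accumulated so far
  reasoning?: string;  // session-only reasoning, absent for non-reasoning models
}

type ReasoningUi = "none" | "thinking" | "expander";

function reasoningUiState(message: AgentMessageView, isStreaming: boolean): ReasoningUi {
  // Non-reasoning models never produce reasoning, so nothing is shown.
  if (!message.reasoning) return "none";
  // Reasoning is streaming but no answer text has arrived yet.
  if (isStreaming && message.text.length === 0) return "thinking";
  // Otherwise offer the collapsed "Show reasoning" expander.
  return "expander";
}
```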

Verification

  • bun test — translator tests (reasoning emitted for both field names; never leaks into the answer) + stream-reducer routing test.
  • bun run build and bun run lint — green.
  • Live-tested against reasoning models on the OpenCode Zen gateway.

Disclosure

Authored with Claude Code (Claude Opus 4.8), reviewed and live-tested by the submitter. Harness: Claude Code.

Mario and others added 8 commits June 27, 2026 17:51
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…olish

- Add normalizeOpenAiBaseUrl() to chatCompletions.ts that strips trailing
  slashes and an optional trailing /v1 segment, preventing the /v1/v1/
  double-append footgun when users paste provider-documented URLs.
- Use normalizeOpenAiBaseUrl in makeChatCompletionsAdapter (endpoint) and
  fetchOpenAiCompatibleModels (/v1/models fetch); drop the now-unused
  trimSlash import from openaiCompatible.ts.
- Remove redundant 'as AiProviderId' cast (M4); drop the unused import.
- Add normalizeOpenAiBaseUrl test coverage in chatCompletions.test.ts and
  a /v1-suffixed base-URL normalization case in openaiCompatible.test.ts.
- Update AiAuthMode baseUrl JSDoc to reflect Ollama + openai-compatible (M1).
- Add OpenAI-Compatible to contextTokens.ts comment for parity (M3).
- Update ProvidersTab base-URL placeholder to https://api.groq.com/openai/v1
  so the UI matches the now-correct /v1-inclusive provider-documented form.
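
A rough sketch of the normalization described in the first bullet, inferred from this commit message rather than copied from the PR. The function body and the `chatCompletionsEndpoint` helper are assumptions.

```typescript
// Strip trailing slashes and one optional trailing /v1 segment.
function normalizeOpenAiBaseUrl(baseUrl: string): string {
  return baseUrl
    .replace(/\/+$/, "")   // drop trailing slashes
    .replace(/\/v1$/, "")  // drop a trailing /v1 segment, if present
    .replace(/\/+$/, "");  // drop any slashes exposed by the previous step
}

// Hypothetical caller: the /v1 prefix is appended exactly once.
function chatCompletionsEndpoint(baseUrl: string): string {
  return `${normalizeOpenAiBaseUrl(baseUrl)}/v1/chat/completions`;
}
```

With this shape, both `https://api.groq.com/openai/v1` and `https://api.groq.com/openai/` resolve to the same endpoint, which avoids the `/v1/v1/` double append.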

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0115W5vEDNwsWeaeS5PyTFgG

The id stays 'openai-compatible' (stable registry/DB identifier); only the
user-facing display label changes — dropdown, credential card, driver label,
and docs. Protocol descriptions and the filename are unchanged.

Real OpenAI-compatible gateways (OpenCode Zen, OpenRouter, vLLM, …) send
explicit `null` for optional per-chunk fields (`usage: null`,
`tool_calls: null`, `delta.content: null`) on every chunk. The chunk schema
used Type.Optional, which accepts absent-or-value but not null, so parseValue
threw, the frame was dropped in translate()'s catch, and the model's entire
reply silently vanished — reasoning models (GLM, DeepSeek, Qwen, MiniMax)
appeared to 'not reply'. Wrap the optional fields in a nullable() helper so
both absent and null validate. Verified against real gateway frames.
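
The absent-versus-null distinction can be illustrated without the schema library. The sketch below uses hand-rolled predicate validators, not the PR's schema code: `Check`, `optional`, `nullable`, and the two field validators are hypothetical stand-ins.

```typescript
// A validator is just a predicate over an unknown value.
type Check = (value: unknown) => boolean;

const isString: Check = (v) => typeof v === "string";

// Optional semantics as described above: absent or valid, but not null.
const optional = (inner: Check): Check => (v) => v === undefined || inner(v);

// Nullable wrapper: additionally accepts an explicit null.
const nullable = (inner: Check): Check => (v) => v === null || inner(v);

// Before the fix: `content: null` fails validation and the frame is dropped.
const strictContent = optional(isString);

// After the fix: absent, null, and string values all validate.
const lenientContent = optional(nullable(isString));
```

Wrapping the inner schema rather than replacing the optional marker keeps the existing absent-field behavior intact while admitting the explicit nulls that gateways send.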
Reasoning models reached over the chat/completions wire stream their
chain-of-thought in delta.reasoning_content / delta.reasoning while the answer
stays empty — the panel looked frozen for seconds. Add an ephemeral reasoning
stream event: the chat/completions translator emits { type: 'reasoning' }
(never added to the assistant message, so it is not persisted or replayed); the
runner forwards it; the agent store accumulates it (rAF-batched, session-only)
into AgentMessage.reasoning; the panel shows an animated 'Thinking…' indicator
while reasoning streams with no answer yet, then an on-demand 'Show reasoning'
expander. No DB/schema changes. Reasoning UI extracted to MessageReasoning to
keep AgentPanel under the module-size ceiling.
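
The rAF-batched accumulation mentioned above might look roughly like this. It is a sketch under assumptions: `createBatchedAccumulator` and `Schedule` are hypothetical names, and the scheduler is injected so the logic can run outside a browser.

```typescript
// Schedules a flush; in the browser this would wrap requestAnimationFrame.
type Schedule = (flush: () => void) => void;

function createBatchedAccumulator(
  schedule: Schedule,
  onFlush: (chunk: string) => void,
): (text: string) => void {
  let pending = "";
  let scheduled = false;

  return (text: string): void => {
    pending += text;
    if (scheduled) return; // a flush is already queued for this frame
    scheduled = true;
    schedule(() => {
      scheduled = false;
      const chunk = pending;
      pending = "";
      onFlush(chunk); // one store update per frame, however many deltas arrived
    });
  };
}

// Browser usage (assumed wiring, shown as a comment to stay environment-neutral):
// const pushReasoning = createBatchedAccumulator(
//   (flush) => requestAnimationFrame(() => flush()),
//   (chunk) => { message.reasoning = (message.reasoning ?? "") + chunk; },
// );
```

Batching per frame keeps a fast token stream from triggering a store update, and therefore a re-render, on every delta.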
@Mariomarquezt Mariomarquezt marked this pull request as ready for review June 28, 2026 06:32