feat(agent): live "Thinking…" indicator for reasoning models by Mariomarquezt · Pull Request #98 · CoreBunch/Instatic

Mariomarquezt · 2026-06-28T06:08:39Z

What & why

When reached over the chat/completions wire, reasoning models stream their chain-of-thought in delta.reasoning_content (or delta.reasoning on some gateways) while the visible answer (delta.content) stays empty for several seconds. The agent panel showed nothing during that window, so it looked frozen. This PR surfaces a lightweight, ephemeral "Thinking…" indicator plus an on-demand "Show reasoning" expander.

Approach

New ephemeral stream event { type: 'reasoning'; text } in AiStreamEvent.
The chat/completions translator emits it from delta.reasoning_content ?? delta.reasoning — never appended to the assistant message, so reasoning is not persisted and never replayed to the provider.
The runner forwards it (no persistence). The agent store accumulates it (rAF-batched, like text) into a session-only AgentMessage.reasoning.
AgentPanel shows an animated "Thinking…" indicator while a turn reasons with no answer yet, then a collapsed "Show reasoning" expander once the answer arrives. UI extracted to a small MessageReasoning component. No DB changes.
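
The translation and display rules above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the ChunkDelta and StreamEvent shapes, the 'text' event name, and the translateDelta and reasoningUi helpers are hypothetical. Only the delta field names and the { type: 'reasoning'; text } event come from the description.

```typescript
// Hypothetical shapes for illustration. Only delta.content,
// delta.reasoning_content, delta.reasoning and the 'reasoning' event
// are named in the PR text; everything else is assumed.
type ChunkDelta = {
  content?: string | null;
  reasoning_content?: string | null;
  reasoning?: string | null;
};

type StreamEvent =
  | { type: 'text'; text: string }
  | { type: 'reasoning'; text: string };

// Translate one streamed delta into zero or more events.
// Reasoning is emitted as its own event and is never mixed into the answer text.
function translateDelta(delta: ChunkDelta): StreamEvent[] {
  const events: StreamEvent[] = [];
  const reasoning = delta.reasoning_content ?? delta.reasoning;
  if (reasoning) events.push({ type: 'reasoning', text: reasoning });
  if (delta.content) events.push({ type: 'text', text: delta.content });
  return events;
}

// Session-only view of one assistant turn (assumed shape).
type TurnView = { text: string; reasoning: string };
type ReasoningUi = 'none' | 'thinking' | 'expander';

// Decide what the panel shows for a turn:
// no reasoning yet -> nothing; reasoning but no answer -> "Thinking…";
// reasoning plus answer -> collapsed "Show reasoning" expander.
function reasoningUi(view: TurnView): ReasoningUi {
  if (view.reasoning === '') return 'none';
  return view.text === '' ? 'thinking' : 'expander';
}
```

Keeping the UI decision a pure function of the accumulated text and reasoning means a non-reasoning model never triggers the indicator, which matches the "unaffected" claim under User / developer impact.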

Stacking note

This builds on #97 (it depends on the shared chatCompletions.ts adapter introduced there). Until #97 merges, this PR's diff also shows the provider commits. Happy to rebase onto main once #97 lands so this becomes a clean, isolated diff.

User / developer impact

Reasoning models (and any chat/completions model that emits reasoning) no longer appear idle; users see live "thinking" feedback and can inspect the reasoning on demand. Non-reasoning models are unaffected (no indicator shown).

Verification

bun test — translator tests (reasoning emitted for both field names; never leaks into the answer) + stream-reducer routing test.
bun run build and bun run lint — green.
Live-tested against reasoning models on the OpenCode Zen gateway.

Disclosure

Authored with Claude Code (Claude Opus 4.8), reviewed and live-tested by the submitter. Harness: Claude Code.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…olish

- Add normalizeOpenAiBaseUrl() to chatCompletions.ts that strips trailing slashes and an optional trailing /v1 segment, preventing the /v1/v1/ double-append footgun when users paste provider-documented URLs.
- Use normalizeOpenAiBaseUrl in makeChatCompletionsAdapter (endpoint) and fetchOpenAiCompatibleModels (/v1/models fetch); drop the now-unused trimSlash import from openaiCompatible.ts.
- Remove redundant 'as AiProviderId' cast (M4); drop the unused import.
- Add normalizeOpenAiBaseUrl test coverage in chatCompletions.test.ts and a /v1-suffixed base-URL normalization case in openaiCompatible.test.ts.
- Update AiAuthMode baseUrl JSDoc to reflect Ollama + openai-compatible (M1).
- Add OpenAI-Compatible to contextTokens.ts comment for parity (M3).
- Update ProvidersTab base-URL placeholder to https://api.groq.com/openai/v1 so the UI matches the now-correct /v1-inclusive provider-documented form.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0115W5vEDNwsWeaeS5PyTFgG
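
The base-URL normalization described in this commit could look roughly like the sketch below. The function name comes from the commit message, but the body is an assumption based only on the stated behavior (strip trailing slashes and an optional trailing /v1). The chatCompletionsUrl helper is hypothetical and only shows why the stripping avoids /v1/v1/.

```typescript
// Sketch only: the real normalizeOpenAiBaseUrl in chatCompletions.ts may differ.
function normalizeOpenAiBaseUrl(baseUrl: string): string {
  return baseUrl
    .replace(/\/+$/, '') // drop trailing slashes
    .replace(/\/v1$/, '') // drop one optional trailing /v1 segment
    .replace(/\/+$/, ''); // drop any slashes exposed by the previous step
}

// Hypothetical helper: the caller appends /v1 itself, so a pasted
// provider-documented URL that already ends in /v1 must be normalized first.
function chatCompletionsUrl(baseUrl: string): string {
  return `${normalizeOpenAiBaseUrl(baseUrl)}/v1/chat/completions`;
}
```

Under this sketch, both https://api.groq.com/openai/v1 and https://api.groq.com/openai resolve to the same endpoint, so either form can be pasted into the base-URL field.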

The id stays 'openai-compatible' (stable registry/DB identifier); only the user-facing display label changes — dropdown, credential card, driver label, and docs. Protocol descriptions and the filename are unchanged.

Real OpenAI-compatible gateways (OpenCode Zen, OpenRouter, vLLM, …) send explicit `null` for optional per-chunk fields (`usage: null`, `tool_calls: null`, `delta.content: null`) on every chunk. The chunk schema used Type.Optional, which accepts absent-or-value but not null, so parseValue threw, the frame was dropped in translate()'s catch, and the model's entire reply silently vanished — reasoning models (GLM, DeepSeek, Qwen, MiniMax) appeared to 'not reply'. Wrap the optional fields in a nullable() helper so both absent and null validate. Verified against real gateway frames.
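
The absent-versus-null distinction behind this fix can be illustrated with a small, library-free sketch. The Validator type and the optional, nullable, and isString helpers below are hypothetical stand-ins, not the project's schema code. They only model the behavior the commit message describes: an optional field accepts absent-or-value, and wrapping it in nullable() also admits an explicit null.

```typescript
// Hypothetical stand-ins for schema combinators, for illustration only.
type Validator = (value: unknown) => boolean;

const isString: Validator = (v) => typeof v === 'string';

// Models the described Type.Optional behavior: absent or valid, but not null.
const optional = (inner: Validator): Validator =>
  (v) => v === undefined || inner(v);

// Models the nullable() helper: an explicit null also validates.
const nullable = (inner: Validator): Validator =>
  (v) => v === null || inner(v);

// A frame as some gateways send it: optional fields present but null.
const frameDelta: { content?: unknown } = { content: null };

const before = optional(isString)(frameDelta.content); // false: frame would be dropped
const after = optional(nullable(isString))(frameDelta.content); // true: frame is kept
```

If the schema library is TypeBox (an assumption based on the Type.Optional name), the equivalent wrapper would presumably be Type.Optional(Type.Union([schema, Type.Null()])).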

Reasoning models reached over the chat/completions wire stream their chain-of-thought in delta.reasoning_content / delta.reasoning while the answer stays empty — the panel looked frozen for seconds. Add an ephemeral reasoning stream event: the chat/completions translator emits { type: 'reasoning' } (never added to the assistant message, so it is not persisted or replayed); the runner forwards it; the agent store accumulates it (rAF-batched, session-only) into AgentMessage.reasoning; the panel shows an animated 'Thinking…' indicator while reasoning streams with no answer yet, then an on-demand 'Show reasoning' expander. No DB/schema changes. Reasoning UI extracted to MessageReasoning to keep AgentPanel under the module-size ceiling.
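
The rAF-batched accumulation mentioned above might be sketched as below. This is an assumption about the general technique, not the agent store's actual code. The createFrameBatcher helper and its injected scheduler are hypothetical, and the scheduler is a parameter so that the sketch also runs outside a browser.

```typescript
// Hypothetical helper: buffer streamed chunks and flush them at most once
// per scheduled frame, so many small deltas cause a single state update.
type Schedule = (flush: () => void) => void;

function createFrameBatcher(
  schedule: Schedule,
  onFlush: (text: string) => void,
): (chunk: string) => void {
  let pending = '';
  let scheduled = false;
  return (chunk: string): void => {
    pending += chunk;
    if (scheduled) return; // a flush is already queued for this frame
    scheduled = true;
    schedule(() => {
      const text = pending;
      pending = '';
      scheduled = false;
      onFlush(text);
    });
  };
}

// In a browser the scheduler would typically wrap requestAnimationFrame:
//   const push = createFrameBatcher(
//     (flush) => requestAnimationFrame(() => flush()),
//     (text) => { message.reasoning += text; },
//   );
```

Because the accumulated string lives only in memory, nothing here touches persistence, which is consistent with the "No DB/schema changes" note.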

Mario and others added 8 commits June 27, 2026 17:51

refactor(ai): extract shared chat/completions adapter from ollama driver

d1952c2

feat(ai): add openai-compatible provider driver (custom base URL)

19d02f5

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

feat(ai): expose openai-compatible provider in the admin providers UI

533c293

docs(ai): document the openai-compatible provider

c57ee74

feat(ai): rename custom provider display label to 'Custom Provider'

8aac024


Mariomarquezt marked this pull request as ready for review June 28, 2026 06:32


Mariomarquezt wants to merge 8 commits into CoreBunch:main from Mariomarquezt:feat/agent-reasoning-display

