feat(swarm): record provider/model on SwarmRun for cost audit + debug #92

Merged
warren618 merged 1 commit into HKUDS:main from Teerapat-Vatpitak:feat/swarm-record-provider-model
May 10, 2026
Conversation

@Teerapat-Vatpitak

Contributor

Summary

  • Adds two optional fields to SwarmRun, provider and model, and populates them from the active LANGCHAIN_PROVIDER / LANGCHAIN_MODEL_NAME env vars at run-creation time.
  • Closes a small but useful observability gap: every .swarm/runs/<id>/run.json written from this point forward records which LLM the run was actually executed against.
  • Backward-compatible: existing on-disk run.json files lack the keys and continue to deserialise (default None).

Why

run.json currently has no field describing which provider / model the swarm was launched against. SwarmAgentSpec.model_name carries per-agent overrides but defaults to None ("use the global config"), and that resolved config is never captured anywhere.

This makes three things harder than they should be:

  1. Cost auditing — the persisted total_input_tokens / total_output_tokens are meaningless without knowing the model's price-per-token.
  2. Side-by-side experiments — comparing kimi-k2.5 vs gpt-4o on the same preset requires manually correlating timestamps with .env history.
  3. Post-hoc bug triage — "this run produced garbage" is much faster to diagnose if the run state shows it ran on the wrong model.
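
As a rough illustration of the cost-auditing case, the sketch below estimates a run's cost from a parsed run.json once provider and model are recorded. The price table, its keys, and the estimate_cost helper are hypothetical and not part of this PR; the prices are placeholders, not real vendor rates.

```python
from typing import Dict, Optional, Tuple

# Hypothetical price table: USD per 1M tokens as (input, output).
# Placeholder values for illustration only.
PRICES: Dict[Tuple[str, str], Tuple[float, float]] = {
    ("example-provider", "example-model"): (1.0, 2.0),
}


def estimate_cost(run: dict, prices: Dict[Tuple[str, str], Tuple[float, float]] = PRICES) -> Optional[float]:
    """Return an estimated USD cost for a parsed run.json, or None if unknown.

    Legacy runs without provider/model keys, and models missing from the
    price table, cannot be priced and yield None.
    """
    key = (run.get("provider"), run.get("model"))
    if key not in prices:
        return None
    in_price, out_price = prices[key]
    in_tokens = run.get("total_input_tokens", 0)
    out_tokens = run.get("total_output_tokens", 0)
    return (in_tokens * in_price + out_tokens * out_price) / 1_000_000


new_run = {
    "provider": "example-provider",
    "model": "example-model",
    "total_input_tokens": 1_000_000,
    "total_output_tokens": 500_000,
}
legacy_run = {"total_input_tokens": 1_000_000, "total_output_tokens": 500_000}

print(estimate_cost(new_run))     # priced via the table
print(estimate_cost(legacy_run))  # None: no provider/model recorded
```

Because the persisted token totals currently come from a character-count heuristic, any figure computed this way is an estimate rather than a billing-accurate number.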

Changes

agent/src/swarm/models.py

Two Optional[str] fields on SwarmRun, both defaulting to None:

provider: str | None = None  # lowercased LANGCHAIN_PROVIDER
model: str | None = None     # raw LANGCHAIN_MODEL_NAME (case-sensitive)

The docstring spells out the scoping rule: this is the run-level default; per-agent overrides remain visible on SwarmAgentSpec.model_name.
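
The defaulting behaviour can be sketched as follows. The real SwarmRun is a Pydantic model; this sketch swaps in a stdlib dataclass so it runs without dependencies, and the class name SwarmRunSketch and the id field are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SwarmRunSketch:
    """Stand-in for SwarmRun; only the fields relevant here are modelled."""

    id: str  # hypothetical field name
    total_input_tokens: int = 0
    total_output_tokens: int = 0
    provider: Optional[str] = None  # run-level default, lowercased
    model: Optional[str] = None     # run-level default, case preserved


# A legacy run.json payload written before the new fields existed.
legacy = json.loads(
    '{"id": "run-001", "total_input_tokens": 1200, "total_output_tokens": 300}'
)
old_run = SwarmRunSketch(**legacy)  # missing keys fall back to None

# A new run survives a JSON round-trip with both fields intact.
new_run = SwarmRunSketch(id="run-002", provider="openai", model="gpt-4o")
restored = SwarmRunSketch(**json.loads(json.dumps(asdict(new_run))))
```

Legacy payloads keep their existing values and simply gain None for the two new fields, which is the property the backward-compat test relies on.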

agent/src/swarm/runtime.py

In SwarmRuntime.start_run, read the same env vars src/providers/llm.py:136,195 reads when constructing ChatLLM, and assign them on the run before _store.create_run persists run.json. Empty / unset env produces None, not the empty string.

run.provider = (os.getenv("LANGCHAIN_PROVIDER") or "").strip().lower() or None
run.model = (os.getenv("LANGCHAIN_MODEL_NAME") or "").strip() or None
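
For reviewers who want to see the normalisation in isolation, here is a minimal standalone sketch of the same logic. The helper names read_env and resolve_run_llm are hypothetical; the PR itself inlines the two assignments shown above.

```python
import os
from typing import Optional, Tuple


def read_env(name: str, *, lowercase: bool = False) -> Optional[str]:
    """Return a stripped env value, or None when unset or blank."""
    value = (os.getenv(name) or "").strip()
    if lowercase:
        value = value.lower()
    return value or None


def resolve_run_llm() -> Tuple[Optional[str], Optional[str]]:
    """Resolve the run-level (provider, model) pair from the environment."""
    provider = read_env("LANGCHAIN_PROVIDER", lowercase=True)
    model = read_env("LANGCHAIN_MODEL_NAME")  # case-sensitive, left as-is
    return provider, model
```

The provider is lowercased so that "OpenAI" and "openai" compare equal in later audits, while the model name keeps its case because model identifiers are case-sensitive. Whitespace-only values collapse to None rather than the empty string.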

agent/tests/test_swarm_run_metadata.py (new)

Six unit tests covering the new behaviour:

  • Construction + JSON round-trip with explicit values
  • Default-to-None behaviour
  • Backward compat: legacy run.json (no provider / model keys) still parses, untouched fields preserved
  • Parametrised acceptance of multiple provider names (anthropic, deepseek, openrouter)

Out of scope

  • Per-call (per-iteration) model tracking — workers can already override via SwarmAgentSpec.model_name. That field is unchanged.
  • Real token counting — worker.py:_estimate_tokens is still a character-count heuristic. That's a separate gap worth addressing but it doesn't depend on this PR.
  • Frontend display — the new fields are now in run.json; surfacing them in the UI is a follow-up.

Test plan

  • pytest agent/tests/test_swarm_run_metadata.py -v — 6/6 pass.
  • pytest --ignore=agent/tests/e2e_backtest --tb=line -q — full sweep: same baseline pass count + 6 new tests, no regressions.
  • pytest agent/tests/test_swarm_preset_inspect.py agent/tests/test_swarm_presets_packaging.py agent/tests/test_models.py -v — adjacent swarm tests still green (17/17).

Checklist

  • No changes to protected areas (agent/src/agent/, agent/src/session/, agent/src/providers/).
  • No hardcoded values; reuses the existing LANGCHAIN_* env-var contract.
  • Code follows CONTRIBUTING.md (Conventional Commit prefix, Google-style docstrings, regression test added).
  • Backward-compatible — existing on-disk run.json parses without modification.
run.json currently has no field describing which LLM provider/model the
swarm was actually launched against — SwarmAgentSpec.model_name carries
per-agent overrides but defaults to None ("use the global config"), and
that resolved config is never captured anywhere. Cost auditing,
side-by-side provider experiments, and post-hoc bug triage all need this
information; today it has to be reconstructed from .env timestamps.

This adds two optional fields to SwarmRun:

- provider: lowercase form of LANGCHAIN_PROVIDER, the same env var
  src/providers/llm.py:136,195 reads when constructing ChatLLM.
- model: LANGCHAIN_MODEL_NAME (raw, since model strings are
  case-sensitive — "gpt-4o" vs "openai/gpt-5").

Both default to None and are populated in SwarmRuntime.start_run before
the run is persisted (runtime.py:start_run), so every new run.json from
this point forward records the active provider/model. Per-agent
overrides remain visible on SwarmAgentSpec.model_name; this field is the
run-level default.

Backward compat: existing run.json files lack the keys entirely. Pydantic
deserializes missing optional fields to their default (None), so
SwarmStore.get_run / list_runs continue to work on legacy files. A
parametrized regression test covers this in
agent/tests/test_swarm_run_metadata.py along with construction,
serialization round-trip, and acceptance of multiple provider names.
@warren618 warren618 merged commit e1a62d8 into HKUDS:main May 10, 2026
1 check passed
@warren618

Collaborator

Thanks for adding the run-level provider/model metadata. I verified the new serialization/backward-compatibility tests in a combined merge gate, and the change is small and useful for later cost/debug auditing. Merged.

@Teerapat-Vatpitak Teerapat-Vatpitak deleted the feat/swarm-record-provider-model branch May 13, 2026 03:58
