Executive Summary
Reliability triage for github/gh-aw over the last 24h (Sentry org github, project gh-aw, run §28481991416).
Telemetry is flowing — the spans dataset has fresh data through 2026-06-30T21:56Z. Health is degraded, not down: most spans are ok, and long-running agent spans (up to ~39 min, status ok) are normal. Failures concentrate in three recurring classes: (1) a failed agent invocation with repeated model-call errors, (2) a recurring 60-second timeout producing error spans across many runs, and (3) short-lived gateway request errors.
Two correctness caveats shape this report:
- The Sentry MCP build exposed here has no search_events and no get_trace_details, so all queries used list_events (Sentry query syntax) with client-side aggregation. list_events renders only a fixed field subset and does not surface custom attributes, so per-workflow-name attribution was not possible from the query side. Findings are reported at transaction granularity (invoke_agent, gateway.request).
- Companion datasets errors and logs both returned No results (24h); see Notes. Treat this as an observability gap, not proof of zero errors.
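The client-side aggregation described above can be sketched roughly as follows. This is a minimal illustration, assuming the list_events output has already been parsed into dictionaries; the field names (`transaction`, `span.status`) and the sample rows are assumptions made for illustration, not data from this run.

```python
from collections import Counter


def count_by_transaction_and_status(rows):
    """Count spans per (transaction, status) pair from already-parsed rows."""
    counts = Counter()
    for row in rows:
        # Rows missing a field are bucketed explicitly rather than dropped,
        # so gaps in the rendered field subset stay visible in the totals.
        key = (
            row.get("transaction", "<unknown>"),
            row.get("span.status", "<unknown>"),
        )
        counts[key] += 1
    return counts


# Hypothetical rows shaped like one parsed list_events page (not real data).
sample_rows = [
    {"transaction": "invoke_agent", "span.status": "ok"},
    {"transaction": "invoke_agent", "span.status": "error"},
    {"transaction": "gateway.request", "span.status": "error"},
    {"transaction": "gateway.request", "span.status": "error"},
    {"span.status": "ok"},
]

counts = count_by_transaction_and_status(sample_rows)
```

Because list_events caps each query, totals produced this way describe the sampled pages only and should be read as lower bounds.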
Top Reliability Findings
| Priority | Workflow / Scope | Problem | Evidence | Next Action |
| --- | --- | --- | --- | --- |
| P1: broken user-visible behavior | invoke_agent (agent run) | Agent invocation failed after repeated model-call errors | Trace 7570a964...: root invoke_agent error, dur 31,686 ms; 12 gen_ai child spans error clustered at 785–788 ms (max 788.55 ms) + some ok. 2nd trace 46769f6b...: invoke_agent 77 ms error + 2 gen_ai error | Pull provider error/status on the 786 ms gen_ai spans; confirm whether retry/back-off or a 4xx/5xx hard-fail; verify gh-aw.run.status is ERROR for these runs |
| P2: timeouts | gateway.request → POST /mcp/agenticworkflows | Recurring 60-second timeout producing error spans | 23 default-op error spans pinned at ~59.99 s (max 59,996 ms) across ~13 distinct traces (dff68ea...×4, 34d005f6...×3, 4072957004ae×3, d17d3c3c...×2, a52e1349...×2, a8526a4d...×2, b7ac182e...×2, + singles). In dff68ea... parent http.server gateway.request reports ok at 60.0 s while child default spans error (status-propagation gap) | Locate the 60 s deadline on the agenticworkflows MCP path; raise/handle it and propagate child error to the parent span status |
| P3: transport errors | gateway.request (HTTP layer) | Short-lived gateway request failures | default-op error spans ~7 ms–0.9 s across many traces (0717caf0..., 8534bd3d..., 6017ae24..., c2a81c7a..., 85a44aec..., a070c915...) | Classify by HTTP status / client-disconnect; low severity unless rate climbs |
| P4: instrumentation / correlation | project-wide | Release & truncation correlation not usable from query side | find_releases → No releases despite service.version being emitted (resource attr → Sentry release). has:gen_ai.response.finish_reasons → 0 though emitted on the conclusion span (send_otlp_span.cjs:2146). gh_aw.workflow_name absent; the correct key is gh-aw.workflow.name (present) | Register releases / map service.version→release; verify finish-reasons indexing; report truncation as inconclusive until queryable |
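One way to reproduce the P2 grouping from sampled span rows is sketched below. It is a hedged illustration only: the row shape (`trace`, `status`, `duration_ms`), the 50 ms tolerance, and the placeholder trace ids are assumptions, while the 60 s deadline and the sample durations echo figures quoted in the table.

```python
from collections import Counter


def traces_pinned_to_deadline(spans, deadline_ms=60_000, tolerance_ms=50):
    """Count error spans per trace whose duration lies within tolerance_ms of the deadline."""
    per_trace = Counter()
    for span in spans:
        near_deadline = abs(span["duration_ms"] - deadline_ms) <= tolerance_ms
        # Only error spans count; ok spans at ~60 s are handled separately.
        if span["status"] == "error" and near_deadline:
            per_trace[span["trace"]] += 1
    return per_trace


# Hypothetical rows with placeholder trace ids (not real data).
sample_spans = [
    {"trace": "trace-a", "status": "error", "duration_ms": 59_993},
    {"trace": "trace-a", "status": "error", "duration_ms": 59_996},
    {"trace": "trace-b", "status": "error", "duration_ms": 6.7},
    {"trace": "trace-b", "status": "ok", "duration_ms": 60_013},
]

pinned = traces_pinned_to_deadline(sample_spans)
```

Keeping the status filter separate from the duration filter means fast-failing P3 spans and ok spans that merely ran for 60 s do not inflate the timeout count.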
Representative Traces
View representative traces (continuity verified)
P1: failed agent invocation · 7570a964f4c8045176ccd886c805ef1a
- Root invoke_agent error, 31,686 ms. Children: 12× gen_ai error @ ~785–788 ms (tight cluster ⇒ same repeated failure), plus gen_ai ok @ 23,425 ms / 7,508 ms. Parent→child lineage intact under one invoke_agent transaction.
P2: 60 s timeout · dff68ea53f6e74d5bea86470a277acf9
- Transaction gateway.request → POST /mcp/agenticworkflows. 5× http.server @ ~60,013 ms (ok), 4× gen_ai @ ~60,002 ms (ok), 4× default @ ~59,993 ms (error), plus gh-aw.activation.setup @ 11,618 ms / 2,014 ms. Continuity intact; the timeout failure is visible only on the default children, not the parent.
P3: gateway transport error · 0717caf0b117d7a7b142def7f78ab8f6
- gateway.request default-op span error @ 6.7 ms, a fast-fail at the HTTP layer.
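The status-propagation gap seen in the P2 trace (parent ok, child error) could be flagged mechanically with a check along these lines. This is a sketch under stated assumptions: the span dictionaries, their keys (`span_id`, `parent_id`, `status`), and the ids are hypothetical, and only direct parent-child pairs are examined.

```python
def parents_masking_child_errors(spans):
    """Return ids of spans reported ok that have at least one direct child in error."""
    status_by_id = {span["span_id"]: span["status"] for span in spans}
    masked = set()
    for span in spans:
        parent_id = span.get("parent_id")
        # A root span has no parent, so the lookup yields None and is skipped.
        if span["status"] == "error" and status_by_id.get(parent_id) == "ok":
            masked.add(parent_id)
    return sorted(masked)


# Hypothetical trace fragment (ids and layout are illustrative, not real data).
sample_trace = [
    {"span_id": "p1", "parent_id": None, "status": "ok"},
    {"span_id": "c1", "parent_id": "p1", "status": "error"},
    {"span_id": "c2", "parent_id": "p1", "status": "ok"},
    {"span_id": "p2", "parent_id": None, "status": "error"},
    {"span_id": "c3", "parent_id": "p2", "status": "error"},
]

gaps = parents_masking_child_errors(sample_trace)
```

A parent that is itself in error is not reported, since its status already reflects the failure.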
Recommendations
- Surface the agent-run failure cause (smallest first). Inspect the 786 ms gen_ai error spans in trace 7570a964... for the provider status/message; the uniform duration strongly implies one repeated error (rate-limit/4xx) rather than diverse failures. Confirm gh-aw.run.status=failure is recorded so these are dashboard-visible.
- Fix the 60 s timeout + status propagation on the agenticworkflows MCP path. The deadline recurs across ~13 traces. Even if the limit stays, propagate the child default error up to the parent gateway.request span so the parent is not reported ok at 60.0 s (currently it is).
- Make release correlation usable. service.version is emitted (send_otlp_span.cjs:360) but find_releases is empty; register releases or fix the service.version→Sentry-release mapping so regressions can be compared across versions.
- Close the truncation/observability blind spot. gen_ai.response.finish_reasons is emitted (send_otlp_span.cjs:2146) yet not queryable here, and the errors/logs datasets are empty; verify ingestion/indexing so runaway-token and export failures are detectable rather than inconclusive.
Notes
View notes — missing telemetry, ambiguous fields, tool limits
- MCP build limitations: search_events and get_trace_details are not available; used list_events + client-side aggregation. list_events caps results (~100/query) and renders a fixed field set, so counts are from sampled queries and per-workflow-name attribution was not possible from the query side.
- errors dataset: No results (24h); no Sentry Issues/error events ingested for gh-aw. Reliability signal here rests entirely on the spans dataset.
- logs dataset: No results (24h).
- Attribute presence (verified via has: on spans): span.status ✅ · gh-aw.workflow.name ✅ (note: gh_aw.workflow_name ❌, a naming mismatch) · gh-aw.run.status ✅ · release ✅ (but no registered Releases) · service.version not queryable as a span attr (resource→release) · gen_ai.response.finish_reasons ❌ via has: despite being emitted.
- Truncation / runaway tokens: Inconclusive. gen_ai.response.finish_reasons:length returned no results, but the attribute is not queryable in this build, so neither presence nor absence of truncation is confirmed.
- Cancellations: none explicitly observed; the 60 s pattern presents as error timeouts, not cancelled.
- Long spans: gen_ai spans up to ~2,353 s (~39 min) are status ok and treated as normal agent execution, not failures.
- Emit-side cross-checks against actions/setup/js/send_otlp_span.cjs: workflow id gh-aw.workflow.name (1297/2068), gh-aw.run.status (2076), OTLP status.code/message ERROR=2 (301–333/2049), finish-reasons (2145–2146), service.version resource attr (360).
References: §28481991416
Generated by 🚨 Daily Reliability Review · 180.9 AIC · ⌖ 43 AIC · ⊞ 5.5K · ◷