
fix(test): make canvas breakpoint-frame tests wait on readiness instead of fixed sleeps #57

Merged: DavidBabinec merged 1 commit into main from fix/canvas-progressive-frame-test-flake on Jun 13, 2026


@DavidBabinec


Problem

The canvas-breakpoint tests flaked in CI — specifically the Verify step of the release workflow (this blocked the v0.0.4 release). They failed reliably on the CI runner but passed locally, which is the tell-tale sign of a timing assumption rather than a logic bug.

Failing tests (all in src/__tests__/canvas/):

  • canvas breakpoint activation — 4 tests, timing out at ~1010–1048ms
  • canvas breakpoint rendering — 3 tests, failing fast at ~16–144ms

Root cause (a real, specific issue)

The canvas reveals breakpoint frames progressively (useProgressiveCanvasFrameLoading): the active frame after a requestAnimationFrame, then each inactive frame behind a chained setTimeout(32ms) → requestIdleCallback({timeout:160ms}). Only after a frame is revealed does the canvas mount its iframe and portal the page tree into it.
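The reveal chain described above can be sketched roughly as follows. This is not the body of useProgressiveCanvasFrameLoading; it is a minimal illustration under stated assumptions: the `Scheduler` interface, the function name `revealFramesProgressively`, and the frame name `tablet` are hypothetical, and the sketch assumes the inactive chain starts once the active frame has been revealed. Only the 32ms and 160ms values come from the description.

```typescript
// Hypothetical scheduler abstraction so the sketch runs outside a browser.
// In a browser these would map to requestAnimationFrame, setTimeout and
// requestIdleCallback respectively.
interface Scheduler {
  frame(cb: () => void): void;
  timeout(cb: () => void, ms: number): void;
  idle(cb: () => void, opts: { timeout: number }): void;
}

const INACTIVE_REVEAL_DELAY_MS = 32; // value quoted in the PR description
const IDLE_TIMEOUT_MS = 160; // value quoted in the PR description

function revealFramesProgressively(
  frames: string[],
  active: string,
  scheduler: Scheduler,
  onReveal: (frame: string) => void,
): void {
  const inactive = frames.filter((f) => f !== active);

  // Reveal inactive frames one at a time; each one waits behind
  // a timeout followed by an idle callback.
  const revealNext = (index: number): void => {
    const frame = inactive[index];
    if (frame === undefined) return; // no inactive frames left
    scheduler.timeout(() => {
      scheduler.idle(
        () => {
          onReveal(frame);
          revealNext(index + 1);
        },
        { timeout: IDLE_TIMEOUT_MS },
      );
    }, INACTIVE_REVEAL_DELAY_MS);
  };

  // Assumption: the active frame goes first, after one animation frame,
  // and the inactive chain starts from there.
  scheduler.frame(() => {
    onReveal(active);
    revealNext(0);
  });
}
```

With `desktop` active, `mobile` is the first name to come out of the inactive chain, which is why every test that queries `mobile` depends on the full timeout-plus-idle sequence having run.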

Every failing test queries the mobile frame. With desktop active (the default), mobile is the first inactive frame, so it only appears after that whole async chain. But the tests waited with fixed budgets:

  • a hardcoded setTimeout(90ms) "flush" helper (copy-pasted across 4 test files), and
  • the default 1000ms waitFor timeout.

Both are tuned for a fast laptop. Under CI event-loop saturation (this run was ~1.8× slower overall) those timers drift past the budgets, so the iframe content isn't mounted when the assertion runs.
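A rough back-of-the-envelope model shows how thin the 90ms flush budget is even before CI load. The sketch below is an illustration, not a measurement: it assumes one animation frame takes about 16ms, that a uniform slowdown factor scales every timer, and it covers only the reveal of the first inactive frame (iframe mounting and portalling come on top). The names `firstInactiveRevealWindow` and `budgetCoverage` are made up for this example.

```typescript
// Timings quoted in the PR description.
const REVEAL_DELAY_MS = 32;
const IDLE_CALLBACK_TIMEOUT_MS = 160;

// Assumption: one animation frame is roughly 16 ms (60 Hz).
const ASSUMED_FRAME_MS = 16;

interface RevealWindow {
  earliestMs: number; // idle callback fires immediately
  latestMs: number; // idle callback fires only at its timeout
}

// Window in which the first inactive frame is revealed:
// rAF, then the setTimeout delay, then an idle callback that may run
// right away (idle loop) or as late as its timeout (busy loop).
function firstInactiveRevealWindow(slowdown = 1): RevealWindow {
  const earliest = ASSUMED_FRAME_MS + REVEAL_DELAY_MS;
  const latest = earliest + IDLE_CALLBACK_TIMEOUT_MS;
  return { earliestMs: earliest * slowdown, latestMs: latest * slowdown };
}

type Coverage = 'always' | 'only-when-idle' | 'never';

// Does a fixed wait budget cover the reveal window?
function budgetCoverage(budgetMs: number, w: RevealWindow): Coverage {
  if (budgetMs >= w.latestMs) return 'always';
  if (budgetMs >= w.earliestMs) return 'only-when-idle';
  return 'never';
}

const nominal = firstInactiveRevealWindow(1);
const loaded = firstInactiveRevealWindow(1.8);
console.log(budgetCoverage(90, nominal), budgetCoverage(90, loaded));
```

Under this model the 90ms sleep only covers the reveal when the idle callback fires early, and at the ~1.8× slowdown the earliest reveal already lands at roughly 86ms, leaving almost no margin.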

Proven, not guessed

I reproduced the exact CI signature locally by inflating the inactive-frame reveal delay (32ms → 1500ms): the activation tests timed out at ~1010–1048ms and the rendering tests failed at ~110ms, matching CI. With this change applied, the same slow-runner simulation passes 17/17 in ~6.6s, which shows the waiters absorb the delay instead of giving up.

Fix

Replace every arbitrary sleep with condition-based waiting:

  • New shared helpers in iframeCanvasQuery.ts (waitForCanvasFrameDocument, waitForCanvasNodeInFrame, waitForCanvasElement) poll the real readiness condition with a CI-tolerant 5s ceiling. waitFor returns the instant the condition holds, so there's zero added cost on a fast machine; it only widens headroom for a slow one. The progressive reveal is bounded, so a healthy canvas always settles well within 5s and an unhealthy one fails loudly rather than hanging.
  • Removes the flushProgressiveCanvasFrames 90ms-sleep helper that was duplicated across breakpointProps, canvasFormControls, and nodeRendererLockdown.
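For readers unfamiliar with the pattern, condition-based waiting with a ceiling can be sketched as below. This is a stand-alone stand-in, not the code in iframeCanvasQuery.ts: the real helpers are described as building on waitFor, whereas `waitForCondition`, its option names, and the 10ms polling interval here are assumptions made for illustration. Only the 5s ceiling is taken from the description.

```typescript
interface WaitOptions {
  timeoutMs?: number; // ceiling; 5000 mirrors the "5s ceiling" above
  intervalMs?: number; // polling interval (assumed value)
}

// Polls `probe` until it returns a non-null value, then resolves with it.
// Rejects once the ceiling is reached so a broken canvas fails loudly.
async function waitForCondition<T>(
  probe: () => T | null | undefined,
  { timeoutMs = 5000, intervalMs = 10 }: WaitOptions = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const value = probe();
    // Return as soon as the condition holds: no fixed cost on fast machines.
    if (value !== null && value !== undefined) return value;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Usage: a stand-in "frame document" that becomes available after 150 ms,
// well past what a 90 ms sleep would have tolerated.
let frameDocument: { title: string } | null = null;
setTimeout(() => {
  frameDocument = { title: 'mobile' };
}, 150);

waitForCondition(() => frameDocument).then((doc) => {
  console.log(`frame ready: ${doc.title}`);
});
```

The ceiling is only an upper bound on failure time; the wait itself ends on the first poll at which the probe succeeds, which is the property the fix relies on.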

No product code changes — the progressive loader is correct behaviour (it avoids mounting three heavy iframes at once); only the tests were making brittle timing assumptions. progressiveCanvasLoading.test.tsx is intentionally left as-is: there the staggered timing is the behaviour under test.

Verification

  • Affected canvas tests: 17/17 pass normally, and 17/17 under the 1500ms slow-runner simulation that previously failed.
  • Full suite: bun test 5431 pass / 0 fail; bun run build and bun run lint clean.

🤖 Generated with Claude Code

The canvas-breakpoint tests flaked in CI (Verify step of the release
workflow), failing reliably on a loaded runner while passing locally.

Root cause: the canvas reveals breakpoint frames progressively
(useProgressiveCanvasFrameLoading — the active frame after a
requestAnimationFrame, each inactive frame behind a chained
setTimeout(32ms) → requestIdleCallback({timeout:160ms})), then mounts each
frame's iframe and portals the page tree in. The failing tests all query
the `mobile` frame, which — with `desktop` active — is the FIRST inactive
frame, so it appears only after that whole async chain. The tests waited
with fixed budgets: a hardcoded setTimeout(90ms) "flush" and the default
1000ms waitFor. Those are calibrated for a fast laptop; under CI
event-loop saturation the timers drift past both, so the iframe content
isn't present when the assertion runs.

Proven by inflating the reveal delay to 1500ms locally, which reproduced
the exact CI signature (activation tests timing out ~1010ms, rendering
tests failing ~110ms); with this change the same simulation passes 17/17.

Fix: replace every arbitrary sleep with condition-based waiting. Add
shared waitForCanvasFrameDocument / waitForCanvasNodeInFrame /
waitForCanvasElement helpers in iframeCanvasQuery.ts that poll the real
readiness condition with a CI-tolerant 5s ceiling (waitFor returns the
instant the condition holds, so zero cost on a fast machine). This also
removes the flushProgressiveCanvasFrames helper that was copy-pasted
across four test files. Product code is untouched — the progressive
loader is correct; the tests were making brittle timing assumptions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Comment on lines +8 to +13
import {
  getCanvasFrameDocument,
  queryCanvasNodeInFrame,
  waitForCanvasFrameDocument,
  waitForCanvasNodeInFrame,
} from './iframeCanvasQuery'
@DavidBabinec DavidBabinec merged commit 0d7bba8 into main Jun 13, 2026
6 checks passed
@DavidBabinec DavidBabinec deleted the fix/canvas-progressive-frame-test-flake branch June 13, 2026 09:21