Proposal: Optional Circuit Breaker for Provider Health #231

@JoshSalway

Context

AI providers periodically experience elevated error rates and partial outages, as reflected on their public status pages.

During these periods, applications relying heavily on AI can experience repeated timeouts and degraded performance.

The current failover implementation in the AI SDK is well designed.

It correctly:

  • Attempts providers in order
  • Fails over only for FailoverableException
  • Emits failover events

This works well for handling individual request failures.

However, during sustained provider instability (timeouts, 5xx errors, overloads, or rate limits), every request still begins with the primary provider. Even when that provider is clearly unhealthy, each request must wait for it to fail before falling back.

For AI-intensive applications, especially those relying on real-time AI responses, this can lead to:

  • Repeated timeouts across requests
  • Slower user-facing responses
  • Queue congestion under load
  • Increased pressure on already unstable providers

The current design is reactive per request and does not track provider health across requests.
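
To illustrate the point, here is a minimal, language-neutral sketch in Python of stateless ordered failover. It is not the SDK's code: `FailoverableError`, `call_with_failover`, and the two fake providers are hypothetical stand-ins that only mirror the behavior described above (try providers in order, fail over only on a failoverable error).

```python
class FailoverableError(Exception):
    """Hypothetical stand-in for the SDK's FailoverableException."""


def call_with_failover(providers, prompt):
    """Try each (name, callable) in order; fail over only on FailoverableError."""
    last_error = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except FailoverableError as exc:
            # A real SDK would emit a failover event here.
            last_error = exc
    if last_error is None:
        raise RuntimeError("no providers configured")
    raise last_error


# Count how often each provider is attempted.
attempts = {"primary": 0, "secondary": 0}


def primary(prompt):
    attempts["primary"] += 1
    raise FailoverableError("simulated timeout")  # primary is unhealthy


def secondary(prompt):
    attempts["secondary"] += 1
    return f"echo: {prompt}"


providers = [("primary", primary), ("secondary", secondary)]

# No state is carried between requests, so every request hits the
# failing primary first and pays its failure cost before falling back.
results = [call_with_failover(providers, f"req {i}") for i in range(5)]
```

In this simulation all five requests are answered by the secondary provider, yet the primary is still attempted five times. With a real timeout in place of the immediate exception, each of those attempts would add the full timeout to the request's latency.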


Suggested Improvement

Introduce an optional circuit breaker layer that tracks provider failures over time and temporarily skips providers during sustained instability.

When enabled, the SDK could:

  • Track failures in a rolling window
  • Mark a provider as temporarily unhealthy after N failures within that window
  • Skip unhealthy providers immediately for fast failover
  • Periodically allow probe calls to detect recovery
  • Automatically restore providers when healthy
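
The behaviors listed above could look roughly like the following sketch. It is written in Python purely for illustration and keeps state in memory; the class name, method names, and the `threshold`, `window`, and `cooldown` parameters are all assumptions, not an existing SDK API. A Laravel implementation would presumably keep the same state in Laravel Cache so that it is shared across workers.

```python
import time
from collections import deque


class CircuitBreaker:
    """Hypothetical per-provider breaker: opens after `threshold` failures
    inside a rolling `window`, then allows probes once `cooldown` has passed."""

    def __init__(self, threshold=3, window=60.0, cooldown=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.window = window
        self.cooldown = cooldown
        self.clock = clock          # injectable so the demo can fake time
        self.failures = deque()     # timestamps of recent failures
        self.opened_at = None       # None means closed (provider treated as healthy)

    def allow(self):
        """Return True if the provider may be attempted right now."""
        if self.opened_at is None:
            return True
        # Open: skip immediately until the cooldown elapses, then allow probes.
        # This simple version lets every caller probe after the cooldown; a
        # stricter half-open state would admit only one probe at a time.
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self):
        """A successful call (including a probe) restores the provider."""
        self.failures.clear()
        self.opened_at = None

    def record_failure(self):
        now = self.clock()
        if self.opened_at is not None:
            self.opened_at = now    # failed probe: restart the cooldown
            return
        self.failures.append(now)
        # Drop failures that have fallen out of the rolling window.
        while self.failures and now - self.failures[0] > self.window:
            self.failures.popleft()
        if len(self.failures) >= self.threshold:
            self.opened_at = now    # mark the provider temporarily unhealthy


# Demo with a fake clock so no real waiting is needed.
now = [0.0]
breaker = CircuitBreaker(threshold=3, window=60.0, cooldown=30.0, clock=lambda: now[0])

for _ in range(3):
    breaker.record_failure()
skipped = not breaker.allow()       # circuit is open, provider is skipped

now[0] = 31.0
probe_allowed = breaker.allow()     # cooldown elapsed, a probe may go through

breaker.record_success()
restored = breaker.opened_at is None and breaker.allow()
```

A failover loop would call `allow()` before attempting each provider and report the outcome through `record_success()` or `record_failure()`. Skipped providers cost no timeout, which is where the fast failover comes from.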

This would be:

  • Fully opt-in
  • Disabled by default
  • Minimal in scope
  • Backed by Laravel Cache

Fallback protects a single request.

A circuit breaker protects the system.

Would this be worth discussing as an optional resilience enhancement for production AI workloads?
