Circuit Breaker

Lock-free, three-state circuit breaker pattern to protect your database from cascading failures with automatic recovery.

What is a Circuit Breaker?

A circuit breaker prevents your application from repeatedly attempting operations that are likely to fail, protecting both your application and database from additional load.

Analogy: Like an electrical circuit breaker that trips to prevent damage during a short circuit, ScryData's circuit breaker "trips" (opens) when the database becomes unhealthy, failing requests quickly instead of waiting for timeouts.

Why Use a Circuit Breaker?

  • Fast Failure: Return errors in <1ms instead of waiting for timeouts
  • Resource Protection: Don't tie up connection pool slots on connections that will fail
  • Database Protection: Reduce load on a struggling database
  • Automatic Recovery: Automatically test for recovery and restore service

Without vs With Circuit Breaker

Without Circuit Breaker

Client → Proxy → Database (Down)
         ↓
    Wait 5 seconds (timeout)
         ↓
    Return error
         ↓
    Repeat for every request  ← Resources wasted!
                            

With Circuit Breaker

Client → Proxy → Circuit Breaker (Open)
         ↓
    Immediate error (fail fast)  ← <1ms response
         ↓
    No database connection attempt
                            

Three States

                    ┌─────────┐
                    │ CLOSED  │ ← Normal operation
                    │         │   All requests allowed
                    └────┬────┘
                         │ failures >= threshold
                         ▼
                    ┌─────────┐
                    │  OPEN   │ ← Fail fast mode
                    │         │   All requests rejected
                    └────┬────┘
                         │ timeout elapsed
                         ▼
                    ┌─────────┐
        success ←───│HALF-OPEN│ ← Testing recovery
                    │         │   Limited requests allowed
                    └────┬────┘
                         │ any failure
                         ▼
                    Back to OPEN
                            

Closed State

Normal operation — Database is healthy

  • All requests allowed through
  • Track consecutive failures
  • Transition to Open when consecutive failures reach failure_threshold

Open State

Fail fast — Database is down/unhealthy

  • All requests rejected immediately
  • No connection attempts to database
  • Wait for timeout, then transition to HalfOpen

HalfOpen State

Testing recovery — Probing if database is healthy again

  • Limited requests allowed through
  • After success_threshold consecutive successes: transition to Closed
  • On any failure: immediately back to Open
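
The transitions above can be sketched as a small single-threaded state machine. This is an illustrative sketch only, not ScryData's implementation: the Breaker type, its method names, and the use of plain seconds for time are all assumptions made for the example, and it does not cap the number of probe requests admitted while half-open.

```rust
/// The three states described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum State {
    Closed,
    Open,
    HalfOpen,
}

/// Hypothetical single-threaded breaker (names are assumptions, not ScryData's API).
pub struct Breaker {
    state: State,
    consecutive_failures: u32,
    consecutive_successes: u32,
    opened_at_secs: u64,
    failure_threshold: u32,
    success_threshold: u32,
    open_timeout_secs: u64,
}

impl Breaker {
    pub fn new(failure_threshold: u32, success_threshold: u32, open_timeout_secs: u64) -> Self {
        Breaker {
            state: State::Closed,
            consecutive_failures: 0,
            consecutive_successes: 0,
            opened_at_secs: 0,
            failure_threshold,
            success_threshold,
            open_timeout_secs,
        }
    }

    pub fn state(&self) -> State {
        self.state
    }

    /// Returns true if a request may proceed at time `now_secs`.
    pub fn allow(&mut self, now_secs: u64) -> bool {
        if self.state == State::Open
            && now_secs.saturating_sub(self.opened_at_secs) >= self.open_timeout_secs
        {
            // Timeout elapsed: start probing for recovery.
            self.state = State::HalfOpen;
            self.consecutive_successes = 0;
        }
        self.state != State::Open
    }

    pub fn record_success(&mut self) {
        match self.state {
            // A success in Closed resets the consecutive-failure streak.
            State::Closed => self.consecutive_failures = 0,
            State::HalfOpen => {
                self.consecutive_successes += 1;
                if self.consecutive_successes >= self.success_threshold {
                    self.state = State::Closed;
                    self.consecutive_failures = 0;
                }
            }
            State::Open => {}
        }
    }

    pub fn record_failure(&mut self, now_secs: u64) {
        match self.state {
            State::Closed => {
                self.consecutive_failures += 1;
                if self.consecutive_failures >= self.failure_threshold {
                    self.trip(now_secs);
                }
            }
            // Any failure while probing sends the breaker straight back to Open.
            State::HalfOpen => self.trip(now_secs),
            State::Open => {}
        }
    }

    fn trip(&mut self, now_secs: u64) {
        self.state = State::Open;
        self.opened_at_secs = now_secs;
    }
}
```

Passing the clock in as a parameter keeps the sketch deterministic, which makes the transitions easy to exercise in tests.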

Lock-Free Implementation

ScryData's circuit breaker uses atomic operations, keeping per-request overhead in the nanosecond range (well under 1ms):

  • Check if allowed: 10-50ns
  • Record success: 10-50ns
  • State transition: 50-100ns
  • Locks required: 0

Data structures use atomic types:

  • AtomicU8 for state (0=Closed, 1=Open, 2=HalfOpen)
  • AtomicU32 for consecutive failures/successes
  • AtomicU64 for opened_at timestamp

No locks, no waiting, predictable latency.
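
As a rough illustration of how those atomic fields might fit together, here is a minimal sketch using Rust's standard atomics. It is not ScryData's source: the AtomicBreaker type, its method names, the split into separate failure and success counters, and the caller-supplied clock are assumptions for the example. State changes go through compare_exchange so that only one thread performs each transition.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, AtomicU8, Ordering};

// State encoding from the list above.
pub const CLOSED: u8 = 0;
pub const OPEN: u8 = 1;
pub const HALF_OPEN: u8 = 2;

/// Hypothetical lock-free breaker; names and layout are assumptions.
pub struct AtomicBreaker {
    state: AtomicU8,
    consecutive_failures: AtomicU32,
    consecutive_successes: AtomicU32,
    opened_at_secs: AtomicU64,
    failure_threshold: u32,
    success_threshold: u32,
    open_timeout_secs: u64,
}

impl AtomicBreaker {
    pub fn new(failure_threshold: u32, success_threshold: u32, open_timeout_secs: u64) -> Self {
        AtomicBreaker {
            state: AtomicU8::new(CLOSED),
            consecutive_failures: AtomicU32::new(0),
            consecutive_successes: AtomicU32::new(0),
            opened_at_secs: AtomicU64::new(0),
            failure_threshold,
            success_threshold,
            open_timeout_secs,
        }
    }

    pub fn state(&self) -> u8 {
        self.state.load(Ordering::Acquire)
    }

    /// Hot path: one atomic load when not Open; a second load and at most one CAS otherwise.
    pub fn allow(&self, now_secs: u64) -> bool {
        if self.state.load(Ordering::Acquire) != OPEN {
            return true;
        }
        let opened = self.opened_at_secs.load(Ordering::Acquire);
        if now_secs.saturating_sub(opened) < self.open_timeout_secs {
            return false;
        }
        // Only one thread wins the Open -> HalfOpen transition and resets the counter.
        if self
            .state
            .compare_exchange(OPEN, HALF_OPEN, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
        {
            self.consecutive_successes.store(0, Ordering::Release);
        }
        true
    }

    pub fn record_success(&self) {
        match self.state.load(Ordering::Acquire) {
            CLOSED => self.consecutive_failures.store(0, Ordering::Release),
            HALF_OPEN => {
                let successes = self.consecutive_successes.fetch_add(1, Ordering::AcqRel) + 1;
                if successes >= self.success_threshold
                    && self
                        .state
                        .compare_exchange(HALF_OPEN, CLOSED, Ordering::AcqRel, Ordering::Acquire)
                        .is_ok()
                {
                    self.consecutive_failures.store(0, Ordering::Release);
                }
            }
            _ => {}
        }
    }

    pub fn record_failure(&self, now_secs: u64) {
        match self.state.load(Ordering::Acquire) {
            CLOSED => {
                let failures = self.consecutive_failures.fetch_add(1, Ordering::AcqRel) + 1;
                if failures >= self.failure_threshold {
                    self.trip(CLOSED, now_secs);
                }
            }
            HALF_OPEN => self.trip(HALF_OPEN, now_secs),
            _ => {}
        }
    }

    fn trip(&self, from: u8, now_secs: u64) {
        // Publish the timestamp before the state so a reader that sees OPEN also sees it.
        self.opened_at_secs.store(now_secs, Ordering::Release);
        let _ = self
            .state
            .compare_exchange(from, OPEN, Ordering::AcqRel, Ordering::Acquire);
    }
}
```

Every method takes &self, so one instance can be shared across threads (for example behind an Arc) without a mutex. A production version would also need to bound how many probe requests are admitted while half-open, which this sketch leaves out.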

Health Monitor Integration

When enabled (use_health_monitor = true), the circuit breaker can open predictively based on health monitoring:

Health Status   Description         Circuit Behavior
Healthy         No warnings         Normal operation
Degraded        Minor warnings      Normal operation
Unhealthy       Critical warnings   Opens circuit

Critical warnings that trigger Unhealthy:

  • Pool starvation: No available connections with waiting requests
  • Pool saturation: >99% pool utilization
  • Error rate spike: Current error rate >5x baseline

Benefit: Circuit opens before hitting failure threshold, protecting database earlier.
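
The classification rules above could be expressed roughly as follows. This is a sketch under stated assumptions: HealthSnapshot, its fields, classify, and should_open_circuit are hypothetical names invented for illustration, and treating a zero baseline as "no spike" is an assumption of the example rather than documented behavior.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HealthStatus {
    Healthy,
    Degraded,
    Unhealthy,
}

/// Hypothetical snapshot of the signals the health monitor looks at.
#[derive(Debug, Clone, Copy)]
pub struct HealthSnapshot {
    pub available_connections: u32,
    pub waiting_requests: u32,
    /// Fraction of the pool in use, 0.0 to 1.0.
    pub pool_utilization: f64,
    pub error_rate: f64,
    pub baseline_error_rate: f64,
    /// Count of non-critical warnings (assumed input).
    pub minor_warnings: u32,
}

pub fn classify(s: &HealthSnapshot) -> HealthStatus {
    // Pool starvation: no available connections while requests are waiting.
    let starvation = s.available_connections == 0 && s.waiting_requests > 0;
    // Pool saturation: more than 99% utilization.
    let saturation = s.pool_utilization > 0.99;
    // Error rate spike: more than 5x baseline (skipped when the baseline is zero).
    let error_spike =
        s.baseline_error_rate > 0.0 && s.error_rate > 5.0 * s.baseline_error_rate;

    if starvation || saturation || error_spike {
        HealthStatus::Unhealthy
    } else if s.minor_warnings > 0 {
        HealthStatus::Degraded
    } else {
        HealthStatus::Healthy
    }
}

/// Only Unhealthy opens the circuit; Healthy and Degraded leave it alone.
pub fn should_open_circuit(status: HealthStatus) -> bool {
    status == HealthStatus::Unhealthy
}
```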

Configuration

Parameter            Default   Description
enabled              true      Enable circuit breaker
failure_threshold    5         Consecutive failures to open circuit
success_threshold    2         Consecutive successes to close from half-open
open_timeout_secs    60        Seconds before transitioning to half-open
use_health_monitor   true      Enable predictive opening via health monitor

Configuration File

[resilience.circuit_breaker]
enabled = true
failure_threshold = 5
success_threshold = 2
open_timeout_secs = 60
use_health_monitor = true

Environment Variables

export SCRY_RESILIENCE__CIRCUIT_BREAKER__ENABLED=true
export SCRY_RESILIENCE__CIRCUIT_BREAKER__FAILURE_THRESHOLD=5
export SCRY_RESILIENCE__CIRCUIT_BREAKER__SUCCESS_THRESHOLD=2
export SCRY_RESILIENCE__CIRCUIT_BREAKER__OPEN_TIMEOUT_SECS=60
export SCRY_RESILIENCE__CIRCUIT_BREAKER__USE_HEALTH_MONITOR=true
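
Comparing the configuration file with the environment variables suggests a naming rule: a SCRY_ prefix, the section path and key joined by double underscores, all upper-cased. The sketch below encodes that inferred rule alongside the documented defaults. CircuitBreakerConfig, env_var_name, and apply_override are hypothetical names for illustration; the actual loader may behave differently.

```rust
/// Hypothetical config struct mirroring the parameter table above.
#[derive(Debug, Clone, PartialEq)]
pub struct CircuitBreakerConfig {
    pub enabled: bool,
    pub failure_threshold: u32,
    pub success_threshold: u32,
    pub open_timeout_secs: u64,
    pub use_health_monitor: bool,
}

impl Default for CircuitBreakerConfig {
    fn default() -> Self {
        // Defaults as listed in the parameter table.
        CircuitBreakerConfig {
            enabled: true,
            failure_threshold: 5,
            success_threshold: 2,
            open_timeout_secs: 60,
            use_health_monitor: true,
        }
    }
}

/// Maps a config section and key to the environment variable name,
/// following the pattern inferred from the examples above.
pub fn env_var_name(section: &str, key: &str) -> String {
    let path = section.replace('.', "__");
    format!("SCRY_{}__{}", path, key).to_uppercase()
}

/// Applies one string-valued override (as read from the environment) to the config.
pub fn apply_override(
    cfg: &mut CircuitBreakerConfig,
    key: &str,
    value: &str,
) -> Result<(), String> {
    match key {
        "enabled" => cfg.enabled = value.parse::<bool>().map_err(|e| e.to_string())?,
        "failure_threshold" => {
            cfg.failure_threshold = value.parse::<u32>().map_err(|e| e.to_string())?
        }
        "success_threshold" => {
            cfg.success_threshold = value.parse::<u32>().map_err(|e| e.to_string())?
        }
        "open_timeout_secs" => {
            cfg.open_timeout_secs = value.parse::<u64>().map_err(|e| e.to_string())?
        }
        "use_health_monitor" => {
            cfg.use_health_monitor = value.parse::<bool>().map_err(|e| e.to_string())?
        }
        _ => return Err(format!("unknown key: {}", key)),
    }
    Ok(())
}
```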

Example Scenario: Database Outage

10:00:00 - Database crashes

Requests 1-5 fail with connection refused. Circuit breaker tracks failures.

10:00:05 - Circuit Opens

5 consecutive failures reached. Circuit transitions to Open state.

10:00:06+ - Fail Fast Mode

All requests rejected in <1ms. Database protected from connection flood.

10:01:05 - Circuit Half-Open

60 seconds elapsed. Circuit allows limited requests to test recovery.

10:01:06 - Database Recovered

Test requests succeed, reaching the success threshold of 2. Circuit closes. Normal operation resumes.

Benefits Realized

  • Requests arriving after the circuit opened failed in <1ms (vs. a 5-second timeout)
  • Database not overwhelmed by connection attempts
  • Automatic recovery testing every 60 seconds
  • Automatic service restoration when database recovers

Prometheus Metrics

# Circuit breaker state (0=Closed, 1=Open, 2=HalfOpen)
scry_circuit_breaker_state 0

# Consecutive failures (in Closed state)
scry_circuit_breaker_consecutive_failures 0

# Total requests allowed through
scry_circuit_breaker_requests_allowed_total 1234

# Total requests rejected (circuit open)
scry_circuit_breaker_requests_rejected_total 56

Alerting Examples

# Circuit opened (database issues)
scry_circuit_breaker_state == 1

# Frequent rejections (circuit often open)
rate(scry_circuit_breaker_requests_rejected_total[5m]) > 10

# Circuit flapping (unstable database)
changes(scry_circuit_breaker_state[5m]) > 5

Never Worry About Cascading Failures Again

ScryData's circuit breaker protects your database and provides fast failure during outages.
