Resilience Layer
ScryData's resilience features work together to handle failures gracefully:
┌────────────────────────────────────────────────────┐
│  Resilience Layer                                  │
│                                                    │
│  1. Circuit Breaker                                │
│     Check if requests allowed                      │
│     If circuit open → Fail fast (<1ms)             │
│                                                    │
│  2. Connection Retry                               │
│     Exponential backoff with jitter                │
│     Retry on transient failures                    │
│                                                    │
│  3. Health Monitoring                              │
│     Active checks + Passive checks + EMA tracking  │
│     Predictive circuit opening                     │
│                                                    │
│  Result: Robust, self-healing system               │
└────────────────────────────────────────────────────┘
Key Benefits
- Fast Failure: The circuit breaker fails requests in <1ms when the database is down
- Automatic Recovery: Retry logic handles transient failures
- Predictive Protection: Health monitoring detects issues before they cascade
- Database Protection: Failing fast prevents requests from overwhelming a struggling database
Connection Retry
Automatic retry with exponential backoff for connection failures:
Backoff Formula
backoff = min(initial_backoff × multiplier^(attempt-1), max_backoff)
jitter = random(0, backoff × jitter_factor)
total_delay = backoff + jitter
Example (Default Settings)
| Attempt | Backoff | With Jitter |
|---|---|---|
| 1 | 50ms | 50-55ms |
| 2 | 100ms | 100-110ms |
| 3 | 200ms | 200-220ms |
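The formula and the default values can be reproduced with a few lines of Python. This is a minimal sketch, not ScryData's implementation: the function names are hypothetical, and the parameter defaults simply mirror the documented settings.

```python
import random


def base_backoff_ms(attempt, initial_backoff_ms=50, multiplier=2.0, max_backoff_ms=5000):
    """Capped exponential backoff for a 1-indexed attempt number."""
    return min(initial_backoff_ms * multiplier ** (attempt - 1), max_backoff_ms)


def total_delay_ms(attempt, jitter_factor=0.1, rng=random, **backoff_settings):
    """Backoff plus uniform jitter drawn from [0, backoff * jitter_factor]."""
    backoff = base_backoff_ms(attempt, **backoff_settings)
    jitter = rng.uniform(0, backoff * jitter_factor)
    return backoff + jitter


if __name__ == "__main__":
    # Prints the attempt number, the base backoff, and one jittered sample.
    for attempt in (1, 2, 3):
        print(attempt, base_backoff_ms(attempt), round(total_delay_ms(attempt), 1))
```

With the defaults, the base backoff for attempts 1 to 3 is 50, 100, and 200 ms, and the jittered delay falls within the ranges shown in the table. The cap only takes effect at higher attempt counts, where the exponential term would exceed max_backoff_ms.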
Why Jitter?
Without jitter, requests that fail at the same moment all retry at the same moment (the thundering herd problem). Jitter spreads those retries out over time, which distributes the load on the database.
What Gets Retried
Retried: Connection refused, connection timeout, network unreachable, TLS handshake failed
Not Retried: Authentication failed, query syntax error, permission denied, circuit breaker open
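One way to picture this split is a lookup over error categories. The sketch below is illustrative only: the ErrorKind enum and is_retryable function are hypothetical names, not part of ScryData's API, and the categories are copied from the two lists above.

```python
from enum import Enum, auto


class ErrorKind(Enum):
    # Hypothetical categories, named after the lists above.
    CONNECTION_REFUSED = auto()
    CONNECTION_TIMEOUT = auto()
    NETWORK_UNREACHABLE = auto()
    TLS_HANDSHAKE_FAILED = auto()
    AUTHENTICATION_FAILED = auto()
    QUERY_SYNTAX_ERROR = auto()
    PERMISSION_DENIED = auto()
    CIRCUIT_BREAKER_OPEN = auto()


# Transient, connection-level failures: trying again may succeed.
RETRYABLE = frozenset({
    ErrorKind.CONNECTION_REFUSED,
    ErrorKind.CONNECTION_TIMEOUT,
    ErrorKind.NETWORK_UNREACHABLE,
    ErrorKind.TLS_HANDSHAKE_FAILED,
})


def is_retryable(kind):
    """Return True only for errors that a later attempt could plausibly fix."""
    return kind in RETRYABLE
```

The non-retried errors share one property: repeating the same request would produce the same result, so a retry only adds load.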
Feature Integration
Circuit Breaker + Retry
Request → Circuit Breaker
↓
┌────┴────┐
│ Open │ → Reject immediately (no retry)
└─────────┘
↓
┌────┴────┐
│ Closed │ → Retry Logic → Database
└─────────┘
↓
If all retries fail → Circuit breaker records failure
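The flow in the diagram can be sketched as a wrapper around a database call. Everything here is an assumption made for illustration: SimpleBreaker, CircuitOpenError, and call_with_resilience are hypothetical names, the breaker models only the closed and open states, jitter is left out, and resetting the failure count on success is a guess at the counting rule rather than documented behavior.

```python
class CircuitOpenError(Exception):
    """Raised when the breaker rejects a request without touching the database."""


class SimpleBreaker:
    """Closed/open only; the half-open recovery step is omitted in this sketch."""

    def __init__(self, failure_threshold=5):
        self.failure_threshold = failure_threshold
        self.failures = 0

    @property
    def is_open(self):
        return self.failures >= self.failure_threshold

    def record_failure(self):
        self.failures += 1

    def record_success(self):
        # Assumption: a success resets the failure count.
        self.failures = 0


def call_with_resilience(breaker, operation, max_attempts=3, sleep=lambda seconds: None):
    """Check the breaker first, then retry; record one failure per exhausted request."""
    if breaker.is_open:
        raise CircuitOpenError("circuit open: rejected without retrying")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            result = operation()
        except ConnectionError as exc:
            last_error = exc
            if attempt < max_attempts:
                sleep(0.05 * 2 ** (attempt - 1))  # 50 ms, 100 ms, ...
        else:
            breaker.record_success()
            return result
    # Only after every retry has failed does the breaker see a failure.
    breaker.record_failure()
    raise last_error
```

The default sleep is a no-op so the sketch runs instantly; passing time.sleep would make it wait between attempts. Note that the breaker counts failed requests, not failed attempts, which matches the last line of the diagram.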
Health Monitor + Circuit Breaker
The health monitor tracks error rate, latency, and pool utilization. When the status becomes Unhealthy, the circuit breaker opens predictively, before failures cascade.
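To make "predictive" concrete, here is a rough sketch of EMA-based error-rate tracking. The smoothing factor, the Unhealthy threshold, and all names are hypothetical values chosen for illustration. Only the error rate is modeled; latency and pool utilization are left out.

```python
class EmaHealthMonitor:
    """Tracks an exponential moving average (EMA) of the request error rate."""

    def __init__(self, alpha=0.2, unhealthy_error_rate=0.5):
        self.alpha = alpha  # weight of the newest sample (assumed value)
        self.unhealthy_error_rate = unhealthy_error_rate  # assumed threshold
        self.error_rate = 0.0

    def observe(self, ok):
        """Fold one request outcome into the moving average."""
        sample = 0.0 if ok else 1.0
        self.error_rate = self.alpha * sample + (1 - self.alpha) * self.error_rate

    @property
    def status(self):
        if self.error_rate >= self.unhealthy_error_rate:
            return "Unhealthy"
        return "Healthy"


def should_open_circuit(monitor, failure_count, failure_threshold=5):
    """Open on the failure count, or earlier if the monitor reports Unhealthy."""
    return failure_count >= failure_threshold or monitor.status == "Unhealthy"
```

With these assumed values, four failures in a row push the moving average past the threshold, so the circuit would open one failure before the count of 5 is reached.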
All Features Together: Database Outage
t=0s: Outage Starts
Requests fail after exhausting their retry attempts. The circuit breaker counts each failure.
t=5s: Circuit Opens
The failure threshold (5) is reached and the health monitor reports Unhealthy. The circuit opens.
t=6s+: Fail Fast Mode
All requests are rejected in <1ms. No retries are attempted, so the database is protected from further load.
t=66s: Circuit Half-Open
The 60s open timeout has elapsed. A limited number of requests test database recovery.
t=67s: Recovery Detected
Test requests succeed. The circuit closes and normal operation resumes.
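The timeline can be replayed against a small state machine with an injected clock. This is a sketch under stated assumptions, not ScryData's code: the class and function names are hypothetical, one failed request per second is assumed during the outage, and the half-open state does not limit how many probe requests pass through.

```python
class CircuitBreaker:
    """Closed -> open -> half-open -> closed, driven by caller-supplied timestamps."""

    def __init__(self, failure_threshold=5, success_threshold=2, open_timeout_secs=60):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.open_timeout_secs = open_timeout_secs
        self.state = "closed"
        self.failures = 0
        self.successes = 0
        self.opened_at = None

    def allow(self, now):
        """Return True if a request may proceed at time `now` (in seconds)."""
        if self.state == "open" and now - self.opened_at >= self.open_timeout_secs:
            self.state = "half_open"
            self.successes = 0
        return self.state != "open"

    def record_failure(self, now):
        if self.state == "closed":
            self.failures += 1
        # A failed probe in half-open reopens the circuit immediately.
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = now
            self.failures = 0

    def record_success(self):
        if self.state == "half_open":
            self.successes += 1
            if self.successes >= self.success_threshold:
                self.state = "closed"
        self.failures = 0


def replay_outage():
    """Return (time, state) pairs for the outage scenario described above."""
    breaker = CircuitBreaker()
    timeline = []
    for t in range(1, 6):  # assumed: one failed request per second, t=1..5
        if breaker.allow(t):
            breaker.record_failure(t)
    timeline.append((5, breaker.state))
    breaker.allow(6)  # rejected: still inside the open timeout
    timeline.append((6, breaker.state))
    for t in (66, 67):  # the database is back, so both probes succeed
        if breaker.allow(t):
            breaker.record_success()
        timeline.append((t, breaker.state))
    return timeline
```

With success_threshold = 2, the first successful probe leaves the circuit half-open and the second one closes it, which is why recovery lands one step after the half-open transition in this replay.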
Configuration
# Circuit Breaker
[resilience.circuit_breaker]
enabled = true
failure_threshold = 5
success_threshold = 2
open_timeout_secs = 60
use_health_monitor = true
# Connection Retry
[resilience.connection_retry]
enabled = true
max_attempts = 3
initial_backoff_ms = 50
max_backoff_ms = 5000
backoff_multiplier = 2.0
jitter_factor = 0.1
# Active Health Checks
[resilience.healthcheck]
active_enabled = true
interval_secs = 30
timeout_ms = 1000
failure_threshold = 3
Monitoring Resilience
# Circuit breaker state
scry_circuit_breaker_state
# Retry attempts
rate(scry_connection_retry_attempts_total[5m])
# Health status
scry_health_status
# Pool utilization
scry_pool_utilization
Alerting Examples
# Circuit opened
scry_circuit_breaker_state == 1
# Unhealthy status
scry_health_status == 2
# Frequent retries
rate(scry_connection_retry_attempts_total[5m]) > 10
Build a Self-Healing Database Layer
ScryData's integrated resilience features protect your database and recover automatically.