High-Level Overview
ScryData sits between your application and database, transparently intercepting and forwarding queries while extracting metadata for observability:
┌──────────┐         ┌────────────────────────────┐         ┌──────────┐
│          │         │         scry-proxy         │         │          │
│  Client  │────────▶│  - Circuit Breaker         │────────▶│ Postgres │
│  (App)   │◀────────│  - Connection Pool         │◀────────│ Database │
│          │         │  - Event Publisher         │         │          │
└──────────┘         └────────────────────────────┘         └──────────┘
                                   │                              │
                                   │ Query Events                 │ Logical
                                   │ (async)                      │ Replication
                                   ▼                              ▼
         ┌─────────────────────────────────────────────────────────┐
         │                      scry-platform                      │
         │  - Query Analysis                                       │
         │  - Shadow Database                                      │
         │  - Migration Validation                                 │
         └─────────────────────────────────────────────────────────┘
                                   ▲
                                   │ Schema + Data
                                   │ (rate-limited)
                      ┌─────────────────────────┐
                      │      scry-backfill      │
                      │  - CDC Streaming        │
                      │  - Snapshot Mode        │
                      └─────────────────────────┘
                                   ▲
                                   │ WAL / Snapshot
                      ┌─────────────────────────┐
                      │    Postgres Database    │
                      └─────────────────────────┘
Key Design Principles
- Transparency: Drop-in replacement for direct database connection
- Low Overhead: targets ~100μs of added latency per query via async operations and lock-free data structures
- Best-Effort Observability: events are published asynchronously and never block queries
- Resilience: Circuit breaker, retries, and health checks protect your database
Two Data Paths
ScryData uses two complementary data paths to capture both query patterns and actual data for migration validation:
In-Band Path: Query Capture
The in-band path captures live query traffic as it flows through your application:
Application → scry-proxy → PostgreSQL
- scry-proxy sits between your application and database
- Transparently forwards all queries with minimal latency (~100μs)
- Extracts query metadata (SQL, timing, parameters) asynchronously
- Publishes query events to scry-platform for analysis
- Enables real-time query replay against shadow databases
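Because scry-proxy speaks the native wire protocol, adopting it is usually just a connection-string change. Here is a minimal sketch using tokio-postgres; the proxy address and port 6432 are assumptions for illustration, not ScryData defaults:

```rust
use tokio_postgres::NoTls;

#[tokio::main]
async fn main() -> Result<(), tokio_postgres::Error> {
    // Point the client at scry-proxy instead of Postgres itself.
    // Port 6432 is an assumption; use whatever port your proxy
    // is configured to listen on.
    let (client, conn) =
        tokio_postgres::connect("host=localhost port=6432 user=app dbname=app", NoTls).await?;

    // tokio-postgres requires the connection future to be driven separately.
    tokio::spawn(async move {
        if let Err(e) = conn.await {
            eprintln!("connection error: {e}");
        }
    });

    // Application queries are unchanged; the proxy forwards them as-is.
    let row = client.query_one("SELECT 1::INT4", &[]).await?;
    let one: i32 = row.get(0);
    println!("got {one} through the proxy");
    Ok(())
}
```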
Out-of-Band Path: Data Replication
The out-of-band path replicates your database schema and data independently:
PostgreSQL → scry-backfill → scry-platform
- scry-backfill connects directly to PostgreSQL's replication stream
- Uses logical replication (CDC) for continuous, low-impact data sync
- Supports snapshot mode for initial data loading
- Rate-limited to prevent overwhelming production databases
- Keeps shadow database synchronized for accurate migration testing
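Rate limiting is what makes backfill safe to run against production. The token bucket below is an illustrative sketch of the idea, not ScryData's actual implementation:

```rust
use std::time::{Duration, Instant};

/// Illustrative token bucket: refills at `rate` tokens/sec up to `capacity`.
struct TokenBucket {
    capacity: f64,        // maximum burst size
    tokens: f64,          // currently available tokens
    rate: f64,            // refill rate, tokens per second
    last_refill: Instant,
}

impl TokenBucket {
    fn new(rate: f64, burst: f64) -> Self {
        Self { capacity: burst, tokens: burst, rate, last_refill: Instant::now() }
    }

    /// Try to spend `n` tokens (e.g. one per replicated row).
    fn try_acquire(&mut self, n: f64) -> bool {
        let now = Instant::now();
        let elapsed = now.duration_since(self.last_refill).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.rate).min(self.capacity);
        self.last_refill = now;
        if self.tokens >= n {
            self.tokens -= n;
            true
        } else {
            false
        }
    }

    /// How long until `n` tokens will be available at the current rate.
    fn wait_time(&self, n: f64) -> Duration {
        let deficit = (n - self.tokens).max(0.0);
        Duration::from_secs_f64(deficit / self.rate)
    }
}

fn main() {
    // Budget of 5,000 rows/sec with a burst of 500 (illustrative numbers).
    let mut bucket = TokenBucket::new(5_000.0, 500.0);
    assert!(bucket.try_acquire(100.0));
    if !bucket.try_acquire(1_000.0) {
        println!("over budget; back off for {:?}", bucket.wait_time(1_000.0));
    }
}
```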
Why Two Paths?
This dual-path architecture provides complete migration validation:
- Query patterns from scry-proxy show how your application uses the database
- Actual data from scry-backfill ensures shadow databases have realistic content
- Independent operation means either path can run without the other
- Zero application changes required for either path
Request Flow
Here's how a query flows through ScryData from client to database:
1. Client Sends Query
The SQL query arrives via the PostgreSQL wire protocol. ScryData accepts the connection transparently.
2. Circuit Breaker Check
Lock-free atomic check (~10-50ns). If circuit is open, request fails fast without hitting the database.
3. Connection Pool Acquisition
Get a healthy connection from the pool. May create new connection if needed, including health check and state reset.
4. Backend Execution
Query forwarded to PostgreSQL. Response streamed back to client.
5. Event Publishing (Async)
Query metadata sent to event batcher via lock-free channel. Never blocks the response.
6. Metrics Recording
Latency, success/failure, and other metrics recorded atomically (<300ns overhead).
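Condensed into code, the hot path looks roughly like the following self-contained sketch. The event type, channel size, and simulated backend are illustrative stand-ins for the real components:

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::time::{Duration, Instant};
use tokio::sync::mpsc;

// Illustrative stand-in for the real event payload.
struct QueryEvent { sql: String, micros: u64, ok: bool }

#[tokio::main]
async fn main() {
    let circuit_open = AtomicBool::new(false);                          // step 2
    let queries_total = AtomicU64::new(0);                              // step 6
    let (events_tx, mut events_rx) = mpsc::channel::<QueryEvent>(1024); // step 5

    // The event batcher runs on its own task; the hot path never awaits it.
    tokio::spawn(async move {
        while let Some(ev) = events_rx.recv().await {
            println!("event: {} ({}us, ok={})", ev.sql, ev.micros, ev.ok);
        }
    });

    // 1. A query arrives from the client.
    let sql = "SELECT 1";

    // 2. Fail fast if the circuit is open: a single atomic load.
    if circuit_open.load(Ordering::Acquire) {
        eprintln!("circuit open, rejecting query");
        return;
    }

    // 3-4. Stand-in for pool acquisition and backend execution.
    let started = Instant::now();
    let ok = true; // pretend the backend succeeded
    let micros = started.elapsed().as_micros() as u64;

    // 5. Best-effort publish: try_send never blocks; if the channel is
    //    full, the event is dropped rather than delaying the response.
    let _ = events_tx.try_send(QueryEvent { sql: sql.into(), micros, ok });

    // 6. Record metrics atomically.
    queries_total.fetch_add(1, Ordering::Relaxed);

    // Give the batcher task a moment to drain before the demo exits.
    tokio::time::sleep(Duration::from_millis(10)).await;
}
```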
Query Timeline Phases
Every query goes through these measured phases, exposed via /debug/timeline and Prometheus metrics:
| Phase | Description | Typical Duration |
|---|---|---|
| Queue Time | Waiting before pool acquisition starts | <1ms |
| Pool Acquire | Getting a connection (may include health check + state reset) | 100-500μs |
| Backend Execution | Actual query execution on database | Variable |
| Event Publishing | Async event dispatch (not counted in query latency) | <100ns |
Core Components
1. Proxy Server
The main entry point that:
- Listens for incoming TCP connections on the proxy port
- Spawns a connection handler for each client
- Manages graceful shutdown with connection draining
- Tracks active connections
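In sketch form this is the standard Tokio accept loop. The bind address is an assumption, and the real handler speaks the wire protocol rather than just logging:

```rust
use tokio::net::TcpListener;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // Port 6432 is an assumption; bind wherever the proxy is configured.
    let listener = TcpListener::bind("0.0.0.0:6432").await?;

    loop {
        let (client_socket, peer) = listener.accept().await?;
        // One lightweight task per client connection (~KBs, not MBs).
        tokio::spawn(async move {
            println!("accepted connection from {peer}");
            // Real handler: speak the Postgres wire protocol with the
            // client and forward traffic to a pooled backend connection.
            drop(client_socket);
        });
    }
}
```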
2. Circuit Breaker
Lock-free, three-state circuit breaker protecting the backend. Uses AtomicU8 for state and AtomicU32 for counters—no locks, predictable latency.
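A minimal sketch of that state machine is below. The state encoding and failure threshold are illustrative, and a real breaker would also move Open to HalfOpen after a cooldown timer:

```rust
use std::sync::atomic::{AtomicU32, AtomicU8, Ordering};

// State encoding for the AtomicU8 (illustrative).
const CLOSED: u8 = 0;
const OPEN: u8 = 1;
const HALF_OPEN: u8 = 2;

struct CircuitBreaker {
    state: AtomicU8,
    consecutive_failures: AtomicU32,
    failure_threshold: u32,
}

impl CircuitBreaker {
    fn new(failure_threshold: u32) -> Self {
        Self {
            state: AtomicU8::new(CLOSED),
            consecutive_failures: AtomicU32::new(0),
            failure_threshold,
        }
    }

    /// Hot-path check: a single atomic load, no locks.
    fn allow_request(&self) -> bool {
        self.state.load(Ordering::Acquire) != OPEN
    }

    fn record_failure(&self) {
        let failures = self.consecutive_failures.fetch_add(1, Ordering::AcqRel) + 1;
        if failures >= self.failure_threshold {
            // CAS so exactly one thread performs the Closed -> Open transition.
            let _ = self.state.compare_exchange(
                CLOSED, OPEN, Ordering::AcqRel, Ordering::Acquire,
            );
        }
    }

    fn record_success(&self) {
        self.consecutive_failures.store(0, Ordering::Release);
        // A successful half-open probe closes the circuit again.
        let _ = self.state.compare_exchange(
            HALF_OPEN, CLOSED, Ordering::AcqRel, Ordering::Acquire,
        );
    }
}

fn main() {
    let cb = CircuitBreaker::new(5);
    assert!(cb.allow_request());
    for _ in 0..5 { cb.record_failure(); }
    assert!(!cb.allow_request()); // circuit is now open
    cb.record_success();
}
```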
Learn more about the Circuit Breaker →
3. Connection Pool
Protocol-agnostic TCP connection pooling with deadpool integration. Includes health checks on every recycle and automatic state reset (DISCARD ALL).
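Conceptually, the recycle step looks like the following tokio-postgres sketch. It is a simplification of what a pool does on checkout, and the `recycle` function itself is hypothetical, not ScryData's API:

```rust
use tokio_postgres::Client;

/// Hypothetical recycle hook: verify the connection is alive, then wipe
/// session state so the next client starts clean.
async fn recycle(client: &Client) -> Result<(), tokio_postgres::Error> {
    // Health check: cheap round-trip confirming the backend still responds.
    client.simple_query("SELECT 1").await?;

    // State reset: drops prepared statements, temp tables, session settings,
    // and advisory locks left behind by the previous client.
    client.batch_execute("DISCARD ALL").await?;
    Ok(())
}
```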
Learn more about Connection Pooling →
4. Event Publisher
Trait-based abstraction for publishing query events. Supports debug logging and HTTP publishing, with FlexBuffers serialization for a ~50% size reduction vs. JSON.
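The trait boundary might look something like this sketch, with names invented for illustration and serde_json standing in where the HTTP publisher would use FlexBuffers:

```rust
use serde::Serialize;

#[derive(Serialize)]
struct QueryEvent {
    sql: String,
    duration_us: u64,
    success: bool,
}

/// Trait boundary: the proxy only knows about this interface, so
/// publishers (debug log, HTTP, ...) can be swapped at startup.
trait EventPublisher: Send + Sync {
    fn publish(&self, event: &QueryEvent);
}

/// Simplest implementation: log each event. An HTTP publisher would
/// batch events and serialize with FlexBuffers instead of JSON.
struct DebugPublisher;

impl EventPublisher for DebugPublisher {
    fn publish(&self, event: &QueryEvent) {
        // serde_json stands in here for the real FlexBuffers encoding.
        if let Ok(body) = serde_json::to_string(event) {
            println!("query event: {body}");
        }
    }
}

fn main() {
    let publisher: Box<dyn EventPublisher> = Box::new(DebugPublisher);
    publisher.publish(&QueryEvent {
        sql: "SELECT 1".into(),
        duration_us: 120,
        success: true,
    });
}
```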
Learn more about Observability →
5. Health Monitor
Predictive health monitoring using EMA baselines. Tracks error rate, latency (P99), and pool utilization. Warns when metrics deviate from baseline.
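An EMA baseline is a one-line update per observation. The sketch below is illustrative; the smoothing factor and deviation ratio are arbitrary choices:

```rust
/// Exponentially weighted moving average used as a "normal" baseline.
struct EmaBaseline {
    alpha: f64, // smoothing factor in (0, 1]; smaller = slower-moving baseline
    value: Option<f64>,
}

impl EmaBaseline {
    fn new(alpha: f64) -> Self {
        Self { alpha, value: None }
    }

    /// Fold a new observation (e.g. current P99 latency) into the baseline.
    fn update(&mut self, sample: f64) -> f64 {
        let next = match self.value {
            Some(prev) => self.alpha * sample + (1.0 - self.alpha) * prev,
            None => sample, // first observation seeds the baseline
        };
        self.value = Some(next);
        next
    }

    /// Deviation check: warn when a sample exceeds the baseline by `ratio`.
    fn deviates(&self, sample: f64, ratio: f64) -> bool {
        matches!(self.value, Some(base) if base > 0.0 && sample > base * ratio)
    }
}

fn main() {
    let mut p99_baseline = EmaBaseline::new(0.2);
    for sample in [110.0, 105.0, 120.0, 115.0] {
        p99_baseline.update(sample);
    }
    // A latency spike well above baseline triggers a warning.
    assert!(p99_baseline.deviates(400.0, 2.0));
}
```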
Learn more about Health Checks →
6. Metrics System
Central metrics singleton tracking all proxy operations. HDR histograms for accurate percentiles, atomic counters, and hot data tracking with Count-Min Sketch + Top-K heap.
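For example, with the hdrhistogram crate a latency histogram yields accurate tail percentiles at a fixed memory cost (the 3-significant-figure precision below is an assumption, not ScryData's configuration):

```rust
use hdrhistogram::Histogram;

fn main() {
    // Track values with 3 significant figures of precision.
    let mut latencies = Histogram::<u64>::new(3).expect("valid sigfig");

    // Record some query latencies in microseconds.
    for us in [120, 95, 300, 88, 15_000, 140] {
        latencies.record(us).expect("value in range");
    }

    // Accurate tail percentiles, unlike a simple average.
    println!("p50 = {}us", latencies.value_at_quantile(0.50));
    println!("p99 = {}us", latencies.value_at_quantile(0.99));
}
```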
Why Async?
Async architecture allows ScryData to handle thousands of concurrent connections with minimal memory:
| Model | 1,000 Connections | Memory Usage |
|---|---|---|
| Thread-per-connection | 1,000 threads | 8GB+ (8MB stack each) |
| Async (Tokio) | 1,000 tasks | ~8MB total |
ScryData is built entirely on Tokio, the industry-standard async runtime for Rust that powers production systems at Discord, AWS, and Microsoft.
Why Lock-Free?
Locks can cause unpredictable latency spikes. Lock-free atomics ensure:
- Consistent latency: No lock contention delays
- Composability: Safe to call from any async context
- Simplicity: No deadlock concerns
Critical-path operations use lock-free atomics (see the counter sketch below):
- Circuit breaker state transitions: AtomicU8::compare_exchange
- Metrics counters: AtomicU64::fetch_add
- Event batching: tokio::sync::mpsc::Sender::try_send (non-blocking send on a bounded channel)
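For instance, a metrics counter reduces to fetch_add on a shared atomic, which stays correct under arbitrary concurrency with no mutex and no deadlock risk:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    // A shared counter updated without any lock.
    let queries = Arc::new(AtomicU64::new(0));

    let handles: Vec<_> = (0..8)
        .map(|_| {
            let queries = Arc::clone(&queries);
            thread::spawn(move || {
                for _ in 0..100_000 {
                    // A single atomic instruction: no mutex contention,
                    // so worst-case latency stays bounded.
                    queries.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(queries.load(Ordering::Relaxed), 800_000);
}
```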
Protocol Handling
ScryData uses the PostgreSQL wire protocol for communication. Key message types extracted:
| Message Type | Tag | Purpose |
|---|---|---|
| Query | 'Q' | Simple query protocol |
| Parse | 'P' | Extended query (prepared statements) |
| CommandComplete | 'C' | Query completion with row count |
| ErrorResponse | 'E' | Query errors with SQLSTATE |
The Protocol trait abstraction allows ScryData to support multiple databases in the future (MySQL, MongoDB) via feature flags.
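Framing is identical for every post-startup message: a 1-byte tag, a 4-byte big-endian length that includes itself, then the body. This sketch decodes the framing and extracts the SQL from a simple Query ('Q') message:

```rust
/// Parse one post-startup PostgreSQL message from `buf`.
/// Returns (tag, body) if a complete message is available.
fn parse_message(buf: &[u8]) -> Option<(u8, &[u8])> {
    // Need at least the tag byte plus the 4-byte length field.
    if buf.len() < 5 {
        return None;
    }
    let tag = buf[0];
    // Length is big-endian and counts itself but not the tag byte.
    let len = u32::from_be_bytes([buf[1], buf[2], buf[3], buf[4]]) as usize;
    let total = 1 + len;
    if buf.len() < total {
        return None; // message not fully buffered yet
    }
    Some((tag, &buf[5..total]))
}

fn main() {
    // A simple-protocol Query: tag 'Q', then a NUL-terminated SQL string.
    let sql = b"SELECT 1\0";
    let mut msg = vec![b'Q'];
    msg.extend_from_slice(&((4 + sql.len()) as u32).to_be_bytes());
    msg.extend_from_slice(sql);

    let (tag, body) = parse_message(&msg).expect("complete message");
    assert_eq!(tag, b'Q');
    // Drop the trailing NUL to recover the SQL text.
    let sql_text = std::str::from_utf8(&body[..body.len() - 1]).unwrap();
    println!("extracted SQL: {sql_text}");
}
```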
Performance Characteristics
Latency Budget
Target: ~100μs of added latency per query on the proxy hot path
Typical total ScryData overhead, including pool acquisition: ~500μs (0.5ms)
Memory Footprint
- Base: ~10MB (Tokio runtime, binary code)
- Connection pool: ~50KB per connection
- Metrics: ~150KB (histograms, hot data tracker)
- Event batcher: ~100KB per 1000 queued events
Total for 100 connections: ~20MB
Throughput
- Tested: 10,000+ queries/sec on commodity hardware
- Bottleneck: Usually backend database, not ScryData
- Scaling: Linear scaling with CPU cores (Tokio work-stealing)
Ready to See It in Action?
Get early access to ScryData and start validating your database migrations with production traffic.
Request Early Access