Configuration Priority
Configuration is loaded in priority order (highest to lowest):
Environment Variables (SCRY_BACKFILL_*)
|
v
Configuration File (scry-backfill.toml)
|
v
Default Values
This lets you set defaults in scry-backfill.toml and override specific values with environment variables in production. For nested keys, join the section name and the key name with a double underscore (__):
# source.host -> SCRY_BACKFILL_SOURCE__HOST
export SCRY_BACKFILL_SOURCE__HOST="localhost"
# producer.shadow_id -> SCRY_BACKFILL_PRODUCER__SHADOW_ID
export SCRY_BACKFILL_PRODUCER__SHADOW_ID="shadow-prod-001"
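
The priority order and the double-underscore mapping described above can be sketched in a few lines of Python. This is an illustrative model, not scry-backfill's actual loader: the function names `env_overrides`, `deep_merge`, and `resolve` are hypothetical, and the sketch assumes environment values are kept as strings (any type conversion the tool performs is not shown).

```python
PREFIX = "SCRY_BACKFILL_"


def env_overrides(environ):
    """Turn SCRY_BACKFILL_SECTION__KEY variables into a nested dict."""
    overrides = {}
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue
        # Double underscores separate nesting levels; keys are lowercased.
        path = name[len(PREFIX):].lower().split("__")
        node = overrides
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return overrides


def deep_merge(base, override):
    """Return a copy of base with override applied on top, section by section."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve(defaults, file_config, environ):
    """Apply sources from lowest to highest priority: defaults, file, environment."""
    return deep_merge(deep_merge(defaults, file_config), env_overrides(environ))
```

In this sketch, calling `resolve(defaults, file_config, os.environ)` yields a single dictionary in which a value from the file replaces the default and a `SCRY_BACKFILL_*` variable replaces both.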
Complete Configuration Example
A full configuration file showing all available options:
[backfill]
mode = "hybrid"
include_tables = ["public.*"]
exclude_tables = ["public.audit_logs", "public.sessions"]
checkpoint_interval_secs = 60
checkpoint_path = "/var/lib/scry-backfill/checkpoint.json"
[source]
host = "localhost"
port = 5432
database = "production"
user = "replication_user"
password = "${SCRY_BACKFILL_SOURCE__PASSWORD}"
replication_slot = "scry_backfill_slot"
publication_name = "scry_backfill_pub"
ssl_mode = "prefer"
[producer]
shadow_id = "shadow-prod-001"
endpoint = "https://api.scrydata.io/v1/ingest"
auth_token = "${SCRY_BACKFILL_PRODUCER__AUTH_TOKEN}"
[rate_limiter]
enabled = true
max_rows_per_second = 10000
max_bytes_per_second = 10485760
burst_multiplier = 2.0
[wal_monitor]
enabled = true
check_interval_secs = 30
max_wal_bytes_warning = 1073741824
max_wal_bytes_critical = 5368709120
Backfill Settings
Control the overall backfill behavior and table selection.
| Parameter | Default | Description |
|---|---|---|
| mode | hybrid | Operation mode: "hybrid", "snapshot", or "cdc". Hybrid automatically selects based on database state. |
| include_tables | ["*.*"] | Glob patterns for tables to include. Supports schema.table format. |
| exclude_tables | [] | Glob patterns for tables to exclude. Takes precedence over include_tables. |
| checkpoint_interval_secs | 60 | Seconds between checkpoint saves. Lower values provide better crash recovery at the cost of I/O. |
| checkpoint_path | ./checkpoint.json | Path to store checkpoint file for resumable backfill operations. |
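
To illustrate how include_tables and exclude_tables interact, here is a minimal Python sketch in which an exclude match always wins. It uses the standard library's `fnmatchcase` as a stand-in glob matcher; the function name `is_selected` is hypothetical, and scry-backfill's own glob semantics (for example, whether `*` may span a dot) may differ.

```python
from fnmatch import fnmatchcase


def is_selected(table, include_tables, exclude_tables):
    """Decide whether a schema-qualified table name should be backfilled."""
    # Exclude patterns take precedence over include patterns.
    if any(fnmatchcase(table, pattern) for pattern in exclude_tables):
        return False
    return any(fnmatchcase(table, pattern) for pattern in include_tables)
```

With the values from the complete example, `public.users` is selected, `public.sessions` is rejected by the exclude list, and `analytics.events` is rejected because no include pattern matches it.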
Source Database Settings
Configure the connection to your PostgreSQL source database.
| Parameter | Default | Description |
|---|---|---|
| host | localhost | PostgreSQL server hostname or IP address. |
| port | 5432 | PostgreSQL server port. |
| database | — | Database name to replicate from. |
| user | — | Username for database connection. Must have replication privileges. |
| password | — | Password for database connection. Use environment variable for security. |
| replication_slot | scry_backfill | Name of the logical replication slot to use or create. |
| publication_name | scry_backfill | Name of the PostgreSQL publication for logical replication. |
| ssl_mode | prefer | SSL mode: "disable", "allow", "prefer", "require", "verify-ca", "verify-full". |
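
The ssl_mode values listed above are the same as the values of libpq's `sslmode` parameter, so the [source] section maps naturally onto a libpq keyword/value connection string. The Python sketch below shows that mapping for testing connectivity with other PostgreSQL tools. It assumes a libpq-style client; whether scry-backfill builds its connection this way is not stated here, and the helper names `quote_conninfo` and `build_conninfo` are hypothetical.

```python
def quote_conninfo(value):
    """Quote a value for a libpq connection string."""
    # libpq rule: wrap in single quotes, backslash-escape backslashes and quotes.
    text = str(value)
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_conninfo(source):
    """Map [source] settings to a libpq keyword/value connection string."""
    # Left: key in scry-backfill.toml. Right: libpq keyword.
    keywords = {
        "host": "host",
        "port": "port",
        "database": "dbname",
        "user": "user",
        "password": "password",
        "ssl_mode": "sslmode",
    }
    parts = [
        f"{keyword}={quote_conninfo(source[key])}"
        for key, keyword in keywords.items()
        if key in source
    ]
    return " ".join(parts)
```

The replication_slot and publication_name settings are not connection parameters, so the sketch leaves them out.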
Producer Settings
Configure how data is sent to scry-platform.
| Parameter | Default | Description |
|---|---|---|
| shadow_id | — | REQUIRED. Unique identifier for the shadow database in scry-platform. |
| endpoint | https://api.scrydata.io/v1/ingest | scry-platform ingestion endpoint URL. |
| auth_token | — | Authentication token for scry-platform API. Use environment variable for security. |
Security Warning: Never store sensitive credentials like password or auth_token directly in configuration files. Always use environment variables (SCRY_BACKFILL_SOURCE__PASSWORD, SCRY_BACKFILL_PRODUCER__AUTH_TOKEN) or reference them using ${ENV_VAR} syntax in your TOML file.
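
As a rough model of the ${ENV_VAR} syntax, the following Python sketch replaces each reference in a configuration value with the matching environment variable. The function name `expand_env_refs` is hypothetical, and raising an error for an unset variable is an assumption of this sketch; the tool's real behavior for missing variables is not documented here.

```python
import re

# Matches ${NAME} where NAME looks like a shell variable name.
_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env_refs(value, environ):
    """Replace each ${NAME} in value with environ[NAME]; fail loudly if unset."""

    def substitute(match):
        name = match.group(1)
        if name not in environ:
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]

    return _REF.sub(substitute, value)
```

Failing on an unset variable, rather than substituting an empty string, keeps a missing secret from silently turning into an empty password.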
Rate Limiter Settings
Control the rate of data extraction to avoid impacting production database performance.
| Parameter | Default | Description |
|---|---|---|
| enabled | true | Enable rate limiting. Recommended for production databases. |
| max_rows_per_second | 10000 | Maximum number of rows to process per second during snapshot mode. |
| max_bytes_per_second | 10485760 | Maximum bytes per second (default: 10 MB/s). Limits network and disk I/O impact. |
| burst_multiplier | 2.0 | Allows temporary bursts up to this multiple of the rate limit for catching up. |
Tip: Start with conservative rate limits and increase gradually while monitoring your database's CPU and I/O metrics. See Rate Limiting & WAL for detailed tuning guidance.
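
One common way to combine a steady rate with a burst allowance is a token bucket, sketched below in Python. This is an assumption about how such a limiter could work, not a description of scry-backfill's internals: the class name `TokenBucket` is hypothetical, and the caller passes the current time explicitly (for example from `time.monotonic()`) so the behavior is easy to follow.

```python
class TokenBucket:
    """Token bucket: refills at a steady rate, holds up to rate * burst_multiplier."""

    def __init__(self, rate_per_second, burst_multiplier=2.0):
        self.rate = float(rate_per_second)
        # The bucket capacity is what allows short bursts above the steady rate.
        self.capacity = self.rate * burst_multiplier
        self.tokens = self.capacity
        self.last = 0.0

    def _refill(self, now):
        # Add tokens for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

    def try_consume(self, amount, now):
        """Take `amount` tokens if available; return False if the caller must wait."""
        self._refill(now)
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False
```

With max_rows_per_second = 10000 and burst_multiplier = 2.0, a bucket like this would let an idle backfill send up to 20000 rows at once and then settle back to 10000 rows per second. A second bucket of the same shape could model max_bytes_per_second.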
WAL Monitor Settings
Monitor PostgreSQL Write-Ahead Log (WAL) to prevent disk exhaustion from replication lag.
| Parameter | Default | Description |
|---|---|---|
| enabled | true | Enable WAL size monitoring. Highly recommended for production. |
| check_interval_secs | 30 | Seconds between WAL size checks. |
| max_wal_bytes_warning | 1073741824 | WAL size threshold (default: 1 GB) that triggers a warning log and metric. |
| max_wal_bytes_critical | 5368709120 | WAL size threshold (default: 5 GB) that pauses replication to prevent disk exhaustion. |
Note: When the critical threshold is reached, scry-backfill will pause and wait for WAL to be consumed before resuming. This prevents the replication slot from filling your disk. Monitor the scry_backfill_wal_bytes metric in production.
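
The warning and critical thresholds, together with the pause-and-wait behavior, can be modeled with the short Python sketch below. The function names `classify_wal` and `pause_until_consumed` are hypothetical, the sketch assumes a threshold is triggered when the WAL size is greater than or equal to it, and `read_wal_bytes` stands in for however the retained WAL size is actually measured.

```python
def classify_wal(wal_bytes, warning=1073741824, critical=5368709120):
    """Return "ok", "warning", or "critical" for a retained WAL size in bytes."""
    if wal_bytes >= critical:
        return "critical"
    if wal_bytes >= warning:
        return "warning"
    return "ok"


def pause_until_consumed(read_wal_bytes, sleep, check_interval_secs=30,
                         critical=5368709120):
    """Wait while WAL is at or above the critical threshold.

    read_wal_bytes: callable returning the current retained WAL size in bytes.
    sleep: callable taking a number of seconds (for example time.sleep).
    Returns the number of checks that found WAL still at the critical level.
    """
    checks = 0
    while read_wal_bytes() >= critical:
        checks += 1
        sleep(check_interval_secs)
    return checks
```

Passing `sleep` in as a parameter keeps the loop easy to exercise without real delays; in normal use it would simply be `time.sleep`.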