Troubleshooting

Diagnose and resolve common issues with the ScryData proxy, including connection problems, performance bottlenecks, and unexpected circuit breaker behavior.

Connection Issues

Connection problems are often the first issues encountered when setting up ScryData. Here are the most common causes and solutions.

SSL Handshake Failures

Symptom

Clients fail to connect with errors like "SSL SYSCALL error" or "certificate verify failed".

Causes

  • Mismatched TLS versions between client and proxy
  • Invalid or expired SSL certificates
  • Certificate chain not properly configured
  • Client expecting TLS but proxy configured with client_tls_sslmode = "disable"

Solutions

  1. Verify certificate validity:
    openssl x509 -in /etc/scry/server.crt -text -noout | grep -A2 "Validity"
  2. Test SSL connection:
    openssl s_client -connect localhost:5433 -servername localhost
  3. Check TLS configuration matches:
    # Ensure client and proxy TLS modes are compatible
    [proxy]
    client_tls_sslmode = "prefer"  # or "require" for strict TLS
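
For a client-side check, it can help to connect with an explicit sslmode and confirm it agrees with the proxy setting. A quick sketch with psql, assuming the proxy listens on port 5433 (as in the openssl test above) and that client_tls_sslmode follows the usual libpq-style modes:

# Should connect when client_tls_sslmode is "prefer" or "require"
psql "host=localhost port=5433 user=your_user dbname=your_database sslmode=require" -c "SELECT 1"

# Should connect when client_tls_sslmode is "disable" or "prefer"
psql "host=localhost port=5433 user=your_user dbname=your_database sslmode=disable" -c "SELECT 1"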

Authentication Failures

Symptom

Connections rejected with "password authentication failed" or "no pg_hba.conf entry" errors.

Causes

  • Incorrect username or password in client connection string
  • Auth file (userlist.txt) not found or has wrong permissions
  • Password hash format mismatch (MD5 vs SCRAM-SHA-256)
  • Auth query returning incorrect data

Solutions

  1. Verify auth file exists and is readable:
    ls -la /etc/scry/userlist.txt
    cat /etc/scry/userlist.txt
  2. Check auth type matches password format:
    [auth]
    auth_type = "scram-sha-256"  # Must match hash format in userlist.txt
    auth_file = "/etc/scry/userlist.txt"
  3. Test credentials directly against PostgreSQL:
    psql -h localhost -p 5432 -U your_user -d your_database
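
If the hash format is the mismatch, you can compare what PostgreSQL actually stores against the entry in userlist.txt. A minimal check, assuming userlist.txt uses the common one-line-per-user "username" "password-hash" layout:

# Show the stored password verifier (requires superuser privileges)
psql -h localhost -p 5432 -U postgres -d postgres \
  -c "SELECT rolname, rolpassword FROM pg_authid WHERE rolname = 'your_user';"

# SCRAM verifiers start with "SCRAM-SHA-256$"; MD5 hashes start with "md5"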

Performance Issues

Performance problems typically manifest as slow queries or timeouts. Here's how to diagnose and resolve them.

High Latency

When queries through ScryData are slower than expected, check these metrics:

  • scry_query_duration_seconds: total query round-trip time. If high, check backend database performance.
  • scry_pool_wait_time_seconds: time spent waiting for an available connection. If high, increase pool_size.
  • scry_connection_acquire_time_seconds: time to establish a new backend connection. If high, check network latency to the backend.
  • scry_backend_latency_ema: exponential moving average of backend latency. Compare against your baseline; spikes indicate backend issues.
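
A quick way to pull these metrics, assuming the metrics endpoint on localhost:9090 (the same one used in the circuit breaker section below):

# Scrape only the latency- and pool-related series
curl -s http://localhost:9090/metrics | grep -E 'scry_(query_duration|pool_wait_time|connection_acquire_time|backend_latency)'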

Solutions

  • Increase pool size if pool_wait_time is consistently high
  • Check backend database for slow queries or resource contention
  • Review network path between proxy and database for latency
  • Enable query logging to identify slow individual queries

Connection Pool Exhaustion

Symptom

Clients timeout waiting for connections, or you see "connection pool exhausted" errors in logs. The metric scry_pool_available_connections stays at or near zero.

Solutions

  1. Increase pool size:
    [backend]
    pool_size = 50  # Increase from default of 10
  2. Switch to transaction pooling mode if using session mode:
    [backend]
    pool_mode = "transaction"  # Releases connections after each transaction
  3. Reduce connection timeout to fail fast instead of queuing:
    [backend]
    connection_timeout_ms = 3000  # Fail after 3 seconds
  4. Check for connection leaks in your application (connections not being closed); the pg_stat_activity query below can help spot them
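
One way to spot leaked connections is to look for backend sessions that sit idle inside a transaction. A minimal check against PostgreSQL's pg_stat_activity (backend-host is a placeholder for your database host):

# Sessions holding a backend connection while idle in a transaction
psql -h backend-host -p 5432 -U postgres -c \
  "SELECT pid, usename, state, now() - state_change AS time_in_state
   FROM pg_stat_activity
   WHERE state = 'idle in transaction'
   ORDER BY time_in_state DESC;"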

Circuit Breaker Issues

The circuit breaker protects your database from cascading failures, but misconfiguration can cause unexpected behavior.

Circuit Keeps Opening

If the circuit breaker opens too frequently, preventing legitimate traffic:

What to Check

  1. Review failure threshold:
    # Check current circuit breaker state
    curl http://localhost:9090/metrics | grep circuit
    
    # Look for:
    # scry_circuit_breaker_state (0=closed, 1=open, 2=half-open)
    # scry_circuit_breaker_failures_total
  2. Check what's causing failures:
    # Enable debug logging
    RUST_LOG=scry_proxy::resilience=debug scry-proxy
  3. Verify backend health:
    psql -h backend-host -p 5432 -U postgres -c "SELECT 1"
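
To watch the breaker transition while you reproduce the problem, you can poll the state metric, for example:

# Refresh circuit breaker state and failure count every 2 seconds
watch -n 2 "curl -s http://localhost:9090/metrics | grep -E 'scry_circuit_breaker_(state|failures_total)'"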

Threshold Tuning

Adjust thresholds based on your workload characteristics:

[resilience.circuit_breaker]
enabled = true

# Profile for high-traffic, occasional-error workloads (use one profile or the other, not both):
failure_threshold = 15      # More tolerant of sporadic failures
success_threshold = 3       # Requires more successes to close
open_timeout_secs = 120     # Longer recovery window

# Profile for low-traffic, error-sensitive workloads:
failure_threshold = 3       # Opens quickly on errors
success_threshold = 1       # Recovers faster
open_timeout_secs = 30      # Shorter recovery window

Tip

Enable use_health_monitor = true to integrate circuit breaker decisions with active health checks. This prevents the circuit from opening due to slow queries when the backend is actually healthy.
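
A minimal sketch of that setting, assuming the flag lives in the same [resilience.circuit_breaker] section shown above:

[resilience.circuit_breaker]
enabled = true
use_health_monitor = true   # Consult active health checks before opening on failures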

Getting Help

If you've tried the solutions above and still have issues, here's how to get help.

Enable Debug Logging

Capture detailed logs to help diagnose the issue:

# Full debug logging
RUST_LOG=debug scry-proxy

# Component-specific logging
RUST_LOG=scry_proxy::pool=debug,scry_proxy::resilience=debug scry-proxy

# Log to file for analysis
RUST_LOG=debug scry-proxy 2>&1 | tee scry-debug.log

Open a GitHub Issue

When opening an issue, please include:

  • ScryData version: Run scry-proxy --version
  • Operating system and version
  • PostgreSQL version
  • Configuration file (with sensitive values redacted)
  • Relevant log output with RUST_LOG=debug
  • Steps to reproduce the issue
  • Expected vs actual behavior

# Collect system information
echo "ScryData version: $(scry-proxy --version)"
echo "OS: $(uname -a)"
echo "PostgreSQL: $(psql --version)"

Open issues at: github.com/scrydata/scry-proxy/issues

Need More Help?

Get early access to ScryData and receive priority support from our team.
