Skip to content

Federation & Replication Quick Start

Overview

Federation and replication enable seamless data distribution across multiple nodes, regions, and even clouds while maintaining consistency and high availability.

Key Concepts

Federation vs Replication

  • Replication: Full or partial data copies across nodes
  • Federation: Distributed queries across multiple databases
  • Hybrid: Combine both for flexibility

When to Use

  • Multi-region deployments
  • Disaster recovery
  • Geographic data locality
  • High availability (no single point of failure)
  • Load distribution

Quick Start

1. Setup Replication

Primary-Secondary Setup

-- On PRIMARY node
CREATE PUBLICATION main_pub FOR ALL TABLES;

-- On SECONDARY node
CREATE SUBSCRIPTION main_sub
  CONNECTION 'postgresql://primary-host:5432/database'
  PUBLICATION main_pub;

-- Verify replication
SELECT * FROM pg_stat_replication;

2. Multi-Master Replication

Allows writes on multiple nodes (automatic conflict resolution).

-- Node 1: Enable multi-master
SET multi_master_enabled = true;
SET node_id = 1;
SET master_nodes = 'node2:5432,node3:5432';

-- Node 2: Same configuration
SET multi_master_enabled = true;
SET node_id = 2;
SET master_nodes = 'node1:5432,node3:5432';

-- Write on either node
INSERT INTO products (name, price) VALUES ('Widget', 99.99);

-- Changes propagate automatically

3. Sharding Strategy

Distribute data by key range or hash.

-- Hash-based sharding
CREATE TABLE orders (
  id BIGINT PRIMARY KEY,
  customer_id INT,
  amount DECIMAL(10,2),
  shard_key INT GENERATED ALWAYS AS (customer_id % 4)
) PARTITION BY HASH (customer_id);

-- Create shards
CREATE TABLE orders_shard_0 PARTITION OF orders
  FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_shard_1 PARTITION OF orders
  FOR VALUES WITH (MODULUS 4, REMAINDER 1);
-- ... repeat for shards 2, 3

-- Distribute to nodes
ALTER TABLE orders_shard_0 SET (shard_location = 'node1');
ALTER TABLE orders_shard_1 SET (shard_location = 'node2');

4. Geographic Distribution

Deploy across multiple regions.

-- Define regions
CREATE REGION 'us-west' WITH (PRIMARY, REPLICA);
CREATE REGION 'eu-central' WITH (PRIMARY, REPLICA);
CREATE REGION 'ap-southeast' WITH (REPLICA);

-- Create geo-replicated table
CREATE TABLE global_config (
  id SERIAL PRIMARY KEY,
  key VARCHAR(100),
  value TEXT,
  modified_at TIMESTAMP
) WITH (
  replication_regions = 'us-west,eu-central,ap-southeast',
  replication_strategy = 'eventual_consistency'
);

-- Ensure regional presence
SELECT region_name, status FROM pg_regions;

5. Failover Configuration

Automatic failover on primary failure.

-- Configure failover
ALTER SYSTEM SET auto_failover = true;
ALTER SYSTEM SET failover_timeout = '30s';
ALTER SYSTEM SET failure_detection_method = 'heartbeat';

-- Promote secondary
SELECT pg_promote();  -- On standby

-- Check cluster status
SELECT * FROM pg_nodes WHERE node_type = 'primary';

Common Use Cases

Use Case 1: High Availability

-- Primary in datacenter A, secondary in datacenter B
-- If primary fails, secondary takes over automatically
SET high_availability_mode = 'active_passive';

-- Monitor replication lag
SELECT
  usename,
  application_name,
  write_lag,
  flush_lag,
  replay_lag
FROM pg_stat_replication;

Use Case 2: Read Scaling

-- Route reads to closest replica
-- Writes go to primary
-- Distributed load balancing

SELECT * FROM query_result_cache
WHERE node_id = 'nearest_node'  -- Automatic routing
LIMIT 100;

Use Case 3: Disaster Recovery

-- Backup in separate region
CREATE TABLE disaster_recovery_config (
  rpo_minutes INT DEFAULT 15,      -- Recovery Point Objective
  rto_minutes INT DEFAULT 5,       -- Recovery Time Objective
  backup_region VARCHAR(100)
);

-- Regular backups to cloud
BACKUP DATABASE TO 's3://backups/prod';

Use Case 4: Data Locality

-- Store data close to users
-- US customers -> US nodes
-- EU customers -> EU nodes

SELECT * FROM customers
WHERE region = current_region()
-- Automatically routed to regional node

Performance Optimization

1. Replication Settings

-- Optimize for throughput
SET wal_level = 'replica';
SET max_wal_senders = 10;
SET wal_keep_size = '1GB';

-- Monitor replication performance
SELECT slot_name, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;

2. Sharding Considerations

-- Monitor shard balance
SELECT shard_location, COUNT(*) as row_count
FROM orders
GROUP BY shard_location;

-- Rebalance if needed
ALTER TABLE orders REBALANCE SHARDS;

3. Conflict Resolution

-- For multi-master, define conflict resolution
SET conflict_resolution_policy = 'last_write_wins';
-- OR
SET conflict_resolution_policy = 'custom_function';
CREATE FUNCTION resolve_conflict(v1 ANY, v2 ANY) RETURNS ANY
  AS $$ SELECT CASE WHEN v1 > v2 THEN v1 ELSE v2 END; $$
  LANGUAGE SQL;

Monitoring & Health

-- Cluster health check
SELECT
  node_name,
  status,
  last_heartbeat,
  lag_bytes
FROM pg_cluster_health;

-- Replication slots status
SELECT slot_name, slot_type, restart_lsn FROM pg_replication_slots;

-- Lag monitoring
SELECT
  EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())) as replication_lag_seconds;

Troubleshooting

Q: Replication lagging?

A: Check network, increase wal_buffers, reduce replication rate.

Q: Shards unbalanced?

A: Run REBALANCE SHARDS or redistribute manually.

Q: Failover not automatic?

A: Check auto_failover setting and ensure failure detection is working.

Best Practices

  1. Monitor Continuously

    -- Track replication lag
    SELECT * FROM pg_stat_replication ORDER BY write_lag DESC;
    

  2. Plan Capacity

    -- Calculate shards needed
    shard_count = total_rows / target_shard_size;
    

  3. Test Failover

    -- Regular failover drills
    -- Promote standby in test environment
    

  4. Secure Communication

    -- Use SSL for replication
    ALTER SYSTEM SET ssl = on;
    ALTER SYSTEM SET ssl_cert_file = '/path/to/cert.pem';
    

Next Steps

  1. Read /docs/guides/user/REPLICATION_OPERATIONS_GUIDE.md for operational details
  2. Check sharding strategy guide: /docs/guides/user/SHARDING_STRATEGY_DECISION_GUIDE.md
  3. Review multi-master guide: /docs/guides/user/MULTI_MASTER_REPLICATION_OPERATIONS.md
  • High Availability: /docs/guides/user/HIGH_AVAILABILITY_OPERATIONAL_PROCEDURES.md
  • Multi-Cloud: /docs/guides/user/MULTI_CLOUD_FEDERATION_GUIDE.md
  • Backup & Recovery: /docs/guides/user/BACKUP_RECOVERY_GUIDE.md

Document Version: 1.0 Last Updated: December 30, 2025 Audience: Database architects, DevOps engineers Reading Time: 10 minutes