Federation & Replication Quick Start¶
Overview¶
Federation and replication enable seamless data distribution across multiple nodes, regions, and even clouds while maintaining consistency and high availability.
Key Concepts¶
Federation vs Replication¶
- Replication: Full or partial data copies across nodes
- Federation: Distributed queries across multiple databases
- Hybrid: Combine both for flexibility
When to Use¶
- Multi-region deployments
- Disaster recovery
- Geographic data locality
- High availability (no single point of failure)
- Load distribution
Quick Start¶
1. Setup Replication¶
Primary-Secondary Setup¶
-- On PRIMARY node
CREATE PUBLICATION main_pub FOR ALL TABLES;
-- On SECONDARY node
CREATE SUBSCRIPTION main_sub
CONNECTION 'postgresql://primary-host:5432/database'
PUBLICATION main_pub;
-- Verify replication
SELECT * FROM pg_stat_replication;
2. Multi-Master Replication¶
Allows writes on multiple nodes (automatic conflict resolution).
-- Node 1: Enable multi-master
SET multi_master_enabled = true;
SET node_id = 1;
SET master_nodes = 'node2:5432,node3:5432';
-- Node 2: Same configuration
SET multi_master_enabled = true;
SET node_id = 2;
SET master_nodes = 'node1:5432,node3:5432';
-- Write on either node
INSERT INTO products (name, price) VALUES ('Widget', 99.99);
-- Changes propagate automatically
3. Sharding Strategy¶
Distribute data by key range or hash.
-- Hash-based sharding
CREATE TABLE orders (
id BIGINT PRIMARY KEY,
customer_id INT,
amount DECIMAL(10,2),
shard_key INT GENERATED ALWAYS AS (customer_id % 4)
) PARTITION BY HASH (customer_id);
-- Create shards
CREATE TABLE orders_shard_0 PARTITION OF orders
FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_shard_1 PARTITION OF orders
FOR VALUES WITH (MODULUS 4, REMAINDER 1);
-- ... repeat for shards 2, 3
-- Distribute to nodes
ALTER TABLE orders_shard_0 SET (shard_location = 'node1');
ALTER TABLE orders_shard_1 SET (shard_location = 'node2');
4. Geographic Distribution¶
Deploy across multiple regions.
-- Define regions
CREATE REGION 'us-west' WITH (PRIMARY, REPLICA);
CREATE REGION 'eu-central' WITH (PRIMARY, REPLICA);
CREATE REGION 'ap-southeast' WITH (REPLICA);
-- Create geo-replicated table
CREATE TABLE global_config (
id SERIAL PRIMARY KEY,
key VARCHAR(100),
value TEXT,
modified_at TIMESTAMP
) WITH (
replication_regions = 'us-west,eu-central,ap-southeast',
replication_strategy = 'eventual_consistency'
);
-- Ensure regional presence
SELECT region_name, status FROM pg_regions;
5. Failover Configuration¶
Automatic failover on primary failure.
-- Configure failover
ALTER SYSTEM SET auto_failover = true;
ALTER SYSTEM SET failover_timeout = '30s';
ALTER SYSTEM SET failure_detection_method = 'heartbeat';
-- Promote secondary
SELECT pg_promote(); -- On standby
-- Check cluster status
SELECT * FROM pg_nodes WHERE node_type = 'primary';
Common Use Cases¶
Use Case 1: High Availability¶
-- Primary in datacenter A, secondary in datacenter B
-- If primary fails, secondary takes over automatically
SET high_availability_mode = 'active_passive';
-- Monitor replication lag
SELECT
usename,
application_name,
write_lag,
flush_lag,
replay_lag
FROM pg_stat_replication;
Use Case 2: Read Scaling¶
-- Route reads to closest replica
-- Writes go to primary
-- Distributed load balancing
SELECT * FROM query_result_cache
WHERE node_id = 'nearest_node' -- Automatic routing
LIMIT 100;
Use Case 3: Disaster Recovery¶
-- Backup in separate region
CREATE TABLE disaster_recovery_config (
rpo_minutes INT DEFAULT 15, -- Recovery Point Objective
rto_minutes INT DEFAULT 5, -- Recovery Time Objective
backup_region VARCHAR(100)
);
-- Regular backups to cloud
BACKUP DATABASE TO 's3://backups/prod';
Use Case 4: Data Locality¶
-- Store data close to users
-- US customers -> US nodes
-- EU customers -> EU nodes
SELECT * FROM customers
WHERE region = current_region()
-- Automatically routed to regional node
Performance Optimization¶
1. Replication Settings¶
-- Optimize for throughput
SET wal_level = 'replica';
SET max_wal_senders = 10;
SET wal_keep_size = '1GB';
-- Monitor replication performance
SELECT slot_name, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots;
2. Sharding Considerations¶
-- Monitor shard balance
SELECT shard_location, COUNT(*) as row_count
FROM orders
GROUP BY shard_location;
-- Rebalance if needed
ALTER TABLE orders REBALANCE SHARDS;
3. Conflict Resolution¶
-- For multi-master, define conflict resolution
SET conflict_resolution_policy = 'last_write_wins';
-- OR
SET conflict_resolution_policy = 'custom_function';
CREATE FUNCTION resolve_conflict(v1 ANY, v2 ANY) RETURNS ANY
AS $$ SELECT CASE WHEN v1 > v2 THEN v1 ELSE v2 END; $$
LANGUAGE SQL;
Monitoring & Health¶
-- Cluster health check
SELECT
node_name,
status,
last_heartbeat,
lag_bytes
FROM pg_cluster_health;
-- Replication slots status
SELECT slot_name, slot_type, restart_lsn FROM pg_replication_slots;
-- Lag monitoring
SELECT
EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp())) as replication_lag_seconds;
Troubleshooting¶
Q: Replication lagging?¶
A: Check network, increase wal_buffers, reduce replication rate.
Q: Shards unbalanced?¶
A: Run REBALANCE SHARDS or redistribute manually.
Q: Failover not automatic?¶
A: Check auto_failover setting and ensure failure detection is working.
Best Practices¶
-
Monitor Continuously
-
Plan Capacity
-
Test Failover
-
Secure Communication
Next Steps¶
- Read
/docs/guides/user/REPLICATION_OPERATIONS_GUIDE.mdfor operational details - Check sharding strategy guide:
/docs/guides/user/SHARDING_STRATEGY_DECISION_GUIDE.md - Review multi-master guide:
/docs/guides/user/MULTI_MASTER_REPLICATION_OPERATIONS.md
Related Features¶
- High Availability:
/docs/guides/user/HIGH_AVAILABILITY_OPERATIONAL_PROCEDURES.md - Multi-Cloud:
/docs/guides/user/MULTI_CLOUD_FEDERATION_GUIDE.md - Backup & Recovery:
/docs/guides/user/BACKUP_RECOVERY_GUIDE.md
Document Version: 1.0 Last Updated: December 30, 2025 Audience: Database architects, DevOps engineers Reading Time: 10 minutes