Module A-7 · 20 min read

Full sync vs partial resync (PSYNC), the replication backlog and replica reconnection, INFO replication field breakdown, measuring replication lag, and the write-to-primary-before-replica data loss window.

A-7 — Master-Replica Replication: PSYNC, Replication Buffer, and Lag

Q: A Redis primary handles 50MB/s of write traffic. The `repl-backlog-size` is left at its default value of 1MB. A replica experiences a brief network blip and disconnects for 200 milliseconds before reconnecting. What happens next?

The replica requires a full resync because the primary generated 10MB of writes during the 200ms disconnect, overflowing the 1MB circular backlog buffer. — The replication backlog is a circular buffer. If writes come in faster than the buffer can hold them during a disconnect, the oldest commands are overwritten. In this scenario, 50MB/s = 50,000KB/s. A 200ms disconnect means 10,000KB (10MB) of writes occurred. The 1MB backlog was completely overwritten 10 times over. When the replica reconnects and asks for data from its last known offset, the primary no longer has that data in the backlog, forcing a heavy, expensive full resync (BGSAVE + RDB transfer).

Q: An application is configured to route all read queries to Redis replicas to reduce load on the primary. The team notices occasional bugs where users log in, are immediately redirected to a dashboard, and are then mysteriously logged out and sent back to the login screen. Refreshing the page fixes the issue. What is the most likely cause?

The session token is written to the primary upon login, but the immediate read query to load the dashboard hits a replica that hasn't received the token yet due to asynchronous replication lag. — Because Redis replication is asynchronous, a write to the primary returns `OK` to the client *before* the replica receives the data. If the client immediately makes a read request, and that request is routed to a replica with even a few milliseconds of lag, the replica won't find the session token. The application treats this "not found" as an invalid session and logs the user out. This is why critical coordination or immediate read-after-write state (like sessions or locks) must be read from the primary.

Q: What is the primary operational trade-off when configuring `min-replicas-to-write 1` and `min-replicas-max-lag 10` on a Redis primary?

It trades system availability for increased data durability, preventing the primary from accepting writes (effectively causing downtime for write operations) if it loses connection to its replicas. — By default, Redis prefers availability: the primary will gladly accept writes even if every single replica is down or disconnected. However, if that primary then crashes, all those writes are permanently lost. `min-replicas-to-write` changes this behavior. If no replica has acknowledged a ping within the `max-lag` window, the primary returns an error for all write commands. You gain strong protection against data loss in split-brain or network partition scenarios, but you sacrifice write availability during infrastructure hiccups.

Who this module is for: You are running Redis with replicas for read scaling or failover, but you have not understood what happens inside a replication relationship — how data flows, what happens when a replica reconnects after a disconnect, and when replication lag becomes a data consistency crisis. This module covers the full replication model.

The Replication Model

Redis uses asynchronous replication. The primary processes a write command, sends the command to connected replicas, and returns OK to the client — without waiting for replicas to acknowledge. Replicas apply commands in the order they are received, maintaining an eventually consistent copy of the primary's dataset.

Consequence: There is always a window (typically < 1ms, but potentially much longer under load or network issues) during which a write exists on the primary but not on replicas. If the primary fails in this window, the write is lost.

Setting Up Replication

text

# On the replica (redis.conf):
replicaof 10.0.1.50 6379

# Or at runtime:
REPLICAOF 10.0.1.50 6379

# Promote a replica to primary (disconnect from primary):
REPLICAOF NO ONE

Authentication for replication:

text

# replica redis.conf:
masterauth your-primary-password   → authenticate with the primary
requirepass your-replica-password  → protect the replica itself

Initial Sync: Full Resync

When a replica connects to a primary for the first time (or after a long disconnection), it performs a full resync:

text

Replica → PRIMARY: PSYNC ? -1   (no existing replication ID, no offset)

PRIMARY:
1. Sends: +FULLRESYNC {repl_id} {offset}
2. Calls BGSAVE to create an RDB snapshot
3. Streams the RDB file to the replica
4. While RDB is generating, buffers new write commands in the replication buffer
5. After RDB transfer: sends buffered commands

Replica:
1. Receives RDB file
2. Loads RDB (discarding existing dataset)
3. Applies buffered commands
4. Enters streaming replication mode
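The sequence above can be sketched as a toy in-memory model. This is not how Redis implements it: the dataset is a plain `Map`, the "RDB file" is a copy of that map, and `WriteCommand`, `applyCommand`, and `simulateFullResync` are hypothetical names chosen for illustration.

```typescript
// Toy model of a full resync: snapshot plus buffered writes.
// All names are hypothetical; this is not a Redis API.
type WriteCommand =
  | { op: "set"; key: string; value: string }
  | { op: "del"; key: string };

function applyCommand(data: Map<string, string>, cmd: WriteCommand): void {
  if (cmd.op === "set") {
    data.set(cmd.key, cmd.value);
  } else {
    data.delete(cmd.key);
  }
}

function simulateFullResync(
  primary: Map<string, string>,
  writesDuringTransfer: WriteCommand[],
): Map<string, string> {
  // Primary steps 2-3: a point-in-time copy stands in for the RDB file.
  const snapshot = new Map(primary);

  // Primary step 4: writes arriving during the transfer are applied to the
  // primary and also buffered for the replica.
  const buffered: WriteCommand[] = [];
  for (const cmd of writesDuringTransfer) {
    applyCommand(primary, cmd);
    buffered.push(cmd);
  }

  // Replica steps 1-2: load the snapshot, discarding any previous dataset.
  const replica = new Map(snapshot);

  // Replica step 3: replay the buffered commands in their original order.
  for (const cmd of buffered) {
    applyCommand(replica, cmd);
  }
  return replica;
}
```

Without the buffered replay, the replica would end up with the snapshot only and miss every write that landed on the primary while the transfer was in progress.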

Memory impact: During full resync, the primary uses extra RAM for:

The BGSAVE fork + copy-on-write pages
The replication output buffer (buffering commands written during RDB transfer)

A large primary with a high write rate can use 2–3× its normal RAM during a full resync. Size your primary to handle this.
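A rough way to reason about that headroom is to add up the components listed above. The sketch below is a back-of-the-envelope model, not a measurement: `cowFraction` (the share of pages modified while the fork is alive) and `transferSeconds` are assumptions you would have to estimate for your own workload, and `estimateFullResyncPeakBytes` is a hypothetical helper name.

```typescript
// Back-of-the-envelope peak memory during a full resync.
// All parameters are estimates supplied by the caller.
function estimateFullResyncPeakBytes(
  datasetBytes: number,
  cowFraction: number, // 0..1, share of pages rewritten while the child is alive
  writeRateBytesPerSec: number,
  transferSeconds: number,
): number {
  const copyOnWriteBytes = datasetBytes * cowFraction;
  const outputBufferBytes = writeRateBytesPerSec * transferSeconds;
  return datasetBytes + copyOnWriteBytes + outputBufferBytes;
}

const GIB = 1024 ** 3;
const MIB_PER_SEC = 1024 ** 2;

// Assumed workload: 10 GiB dataset, half the pages touched during the fork,
// 50 MiB/s of writes, 60 seconds to generate and ship the snapshot.
const peakBytes = estimateFullResyncPeakBytes(10 * GIB, 0.5, 50 * MIB_PER_SEC, 60);
const peakRatio = peakBytes / (10 * GIB);
```

Under these assumed numbers the peak is a little under twice the dataset size. A write-heavy workload that touches most pages, or a slow transfer, pushes the ratio higher.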

Incremental Resync: PSYNC with Replication ID and Offset

After initial sync, the replica streams commands from the primary continuously. Each replicated command advances the replication offset, a running byte count of how much data has passed through the replication stream.

Redis assigns each primary a replication ID (a random 40-character hex string). When a replica reconnects after a brief disconnect:

text

Replica → PRIMARY: PSYNC {repl_id} {last_offset}

PRIMARY:
- If repl_id matches AND last_offset is within the replication backlog:
  → Partial resync: stream only the commands since last_offset
  → Much faster than full resync

- If repl_id does not match OR last_offset is too old (not in backlog):
  → Full resync required
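The decision above can be written as a small pure function. This is a simplified sketch: the field names are hypothetical, `lastOffset` is treated as the last byte the replica already has, and real Redis handles additional cases (for example, a secondary replication ID kept after a failover) that are left out here.

```typescript
// Simplified model of the primary's partial-vs-full resync decision.
// Field and function names are hypothetical.
interface PrimaryReplState {
  replId: string;
  masterReplOffset: number; // last byte written to the replication stream
  backlogFirstByteOffset: number; // oldest byte still held in the backlog
}

interface PsyncRequest {
  replId: string; // "?" when the replica has no replication history
  lastOffset: number; // last byte the replica already has, -1 if none
}

type ResyncKind = "partial" | "full";

function decideResync(state: PrimaryReplState, req: PsyncRequest): ResyncKind {
  // A different replication ID means the replica followed another history.
  if (req.replId !== state.replId) return "full";

  // The replica needs everything from the byte after its last offset.
  const nextNeeded = req.lastOffset + 1;
  const stillInBacklog =
    nextNeeded >= state.backlogFirstByteOffset &&
    nextNeeded <= state.masterReplOffset + 1;

  return stillInBacklog ? "partial" : "full";
}
```

A replica that is only a few bytes behind falls inside the backlog window and gets a partial resync. One whose next needed byte is older than `backlogFirstByteOffset` does not.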

The Replication Backlog

The replication backlog is a circular buffer on the primary that holds recent write commands. Its size is configurable:

repl-backlog-size 1mb   → default: 1MB

If a replica disconnects and reconnects within the time it takes to fill the backlog, it can do a partial resync. If the backlog has rolled over (the commands since the disconnect are no longer in the buffer), a full resync is required.

Size your backlog appropriately. At 100MB/s of write throughput, a 1MB backlog fills in 10ms. A replica that disconnects for even 1 second will require a full resync. Set the backlog to at least 60 seconds × write rate in bytes:

text

# For 10MB/s write rate, 60-second backlog:
repl-backlog-size 600mb

This trades memory (the backlog lives in RAM) for resilience to brief replica disconnections.
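The sizing rule is simple arithmetic, and it can be worth encoding so the number is recomputed when the write rate changes. A minimal sketch, assuming you already know your sustained write rate in bytes per second; the helper names are illustrative.

```typescript
// Sizing helpers for repl-backlog-size. Names are illustrative, not a Redis API.
const MIB = 1024 * 1024; // redis.conf treats "1mb" as 1024 * 1024 bytes

function recommendedBacklogBytes(
  writeRateBytesPerSec: number,
  toleratedDisconnectSeconds = 60,
): number {
  return Math.ceil(writeRateBytesPerSec * toleratedDisconnectSeconds);
}

function backlogCoversDisconnect(
  backlogBytes: number,
  writeRateBytesPerSec: number,
  disconnectSeconds: number,
): boolean {
  // Partial resync is only possible if the bytes written during the
  // disconnect still fit in the backlog.
  return writeRateBytesPerSec * disconnectSeconds <= backlogBytes;
}

// 10 MB/s for 60 seconds comes to 600 MB.
const sizedBacklog = recommendedBacklogBytes(10 * MIB);

// 50 MB/s against the default 1 MB backlog: a 200 ms blip writes about 10 MB.
const survivesBlip = backlogCoversDisconnect(1 * MIB, 50 * MIB, 0.2);
```

`survivesBlip` is false here, which is the full-resync scenario from the first question at the top of this module.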

Monitoring Replication

INFO replication

On the primary:

text

role:master
connected_slaves:2
slave0:ip=10.0.1.51,port=6379,state=online,offset=84729384,lag=0
slave1:ip=10.0.1.52,port=6379,state=online,offset=84729380,lag=1
master_replid:a3f9c2d7e8b14f2c1d9e8a7b6c5d4e3f
master_repl_offset:84729384
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:83680808
repl_backlog_histlen:1048576

lag — seconds since the replica last sent a REPLCONF ACK. A lag of 0 or 1 is healthy. Growing lag is a warning.
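For monitoring, the per-replica lines are the part worth extracting. Below is a minimal parsing sketch that takes the raw INFO text as a string. `ReplicaStatus` and `parseReplicaLines` are hypothetical names, and the parser tolerates an optional space after the colon.

```typescript
// Extract the slaveN lines from INFO replication output.
// Type and function names are hypothetical.
interface ReplicaStatus {
  name: string;
  ip: string;
  port: number;
  state: string;
  offset: number;
  lagSeconds: number;
}

function parseReplicaLines(infoText: string): ReplicaStatus[] {
  const replicas: ReplicaStatus[] = [];
  for (const rawLine of infoText.split(/\r?\n/)) {
    const match = /^(slave\d+):\s*(.+)$/.exec(rawLine.trim());
    if (!match) continue;
    const [, name = "", body = ""] = match;

    // The value is a comma-separated list of key=value pairs.
    const fields = new Map<string, string>();
    for (const pair of body.split(",")) {
      const eq = pair.indexOf("=");
      if (eq > 0) fields.set(pair.slice(0, eq).trim(), pair.slice(eq + 1).trim());
    }

    replicas.push({
      name,
      ip: fields.get("ip") ?? "",
      port: Number(fields.get("port")),
      state: fields.get("state") ?? "unknown",
      offset: Number(fields.get("offset")),
      lagSeconds: Number(fields.get("lag")),
    });
  }
  return replicas;
}
```

With a client library such as ioredis, the input would typically be the string returned by `info("replication")`. Missing numeric fields come back as `NaN`, so a caller should check for that before alerting on the values.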

On a replica:

text

role:slave
master_host:10.0.1.50
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_read_repl_offset:84729384
slave_repl_offset:84729384
slave_priority:100
slave_read_only:1
replica_announced:1

master_link_status: down means the replica is disconnected. master_sync_in_progress: 1 means a full resync is underway.

Measure replication lag in bytes: The offset difference between master_repl_offset and slave_repl_offset is the lag in bytes. Convert to time: lag_bytes / write_rate_bytes_per_second.
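As a worked example of that conversion, here is a small sketch. The function names are illustrative, and the write rate is something you measure separately (it is not reported alongside the offsets).

```typescript
// Convert an offset delta into an approximate time lag.
// Function names are illustrative.
function replicationLagBytes(masterReplOffset: number, replicaReplOffset: number): number {
  // Clamp at zero: the two offsets are sampled at slightly different moments.
  return Math.max(0, masterReplOffset - replicaReplOffset);
}

function estimateLagSeconds(lagBytes: number, writeRateBytesPerSec: number): number {
  if (writeRateBytesPerSec <= 0) {
    throw new RangeError("writeRateBytesPerSec must be positive");
  }
  return lagBytes / writeRateBytesPerSec;
}

// slave1 in the sample output above is 4 bytes behind the primary.
const slave1LagBytes = replicationLagBytes(84729384, 84729380);

// A replica 20 MiB behind a primary writing 10 MiB/s is roughly 2 seconds behind.
const approxSeconds = estimateLagSeconds(20 * 1024 * 1024, 10 * 1024 * 1024);
```

The time figure is an estimate only. It assumes the write rate stays steady while the replica catches up.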

Replica Configuration Options

text

# Make replica reject write commands (default: yes)
replica-read-only yes

# Replica serves stale data when disconnected from primary (default: yes)
# 'no' causes replica to return error on all commands when disconnected
replica-serve-stale-data yes

# Replica priority for Sentinel failover (lower = preferred)
replica-priority 100

# Minimum replicas required before primary accepts writes (split-brain prevention)
min-replicas-to-write 1        → require at least 1 replica connected
min-replicas-max-lag 10        → replica must be within 10 seconds of lag

min-replicas-to-write

This setting prevents the primary from accepting writes when it cannot reach enough replicas:

text

min-replicas-to-write 1
min-replicas-max-lag 10

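With these settings, if all replicas disconnect (a network partition, or every replica crashing), the primary starts rejecting writes once no replica has acknowledged within the last 10 seconds. This bounds the split-brain window: an isolated primary can accept at most roughly 10 seconds of writes that will be discarded when the partition heals. Because replication is still asynchronous, it limits the loss window rather than eliminating it.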

Trade-off: This reduces availability (primary stops during replica outage). For systems where data loss is unacceptable, this is the correct trade. For systems where availability is more important than durability, leave at min-replicas-to-write 0.
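The rule the primary applies can be modelled in a few lines. This is a sketch of the documented behaviour rather than Redis source code; `primaryAcceptsWrites` is a hypothetical name, and the lag values are the per-replica `lag` seconds from INFO replication.

```typescript
// Model of the min-replicas-to-write / min-replicas-max-lag check.
// Function name is hypothetical.
function primaryAcceptsWrites(
  replicaLagsSeconds: number[],
  minReplicasToWrite: number,
  minReplicasMaxLag: number,
): boolean {
  // Setting either option to 0 disables the feature.
  if (minReplicasToWrite === 0 || minReplicasMaxLag === 0) return true;

  // A replica counts as "good" if its last acknowledgement is recent enough.
  const goodReplicas = replicaLagsSeconds.filter(
    (lag) => lag <= minReplicasMaxLag,
  ).length;
  return goodReplicas >= minReplicasToWrite;
}
```

With `min-replicas-to-write 1` and `min-replicas-max-lag 10`, a primary whose replicas report lags of 0 and 1 accepts writes. One whose replicas are all more than 10 seconds behind, or that has no replicas connected, rejects them.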

Read from Replicas

Route read-heavy, latency-tolerant queries to replicas to reduce primary load:

typescript

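import Redis from "ioredis"; // assumed client library (the hset call shape matches ioredis)
// Example addresses: the primary and first replica from the INFO output earlier
const primaryRedis = new Redis({ host: "10.0.1.50", port: 6379 });
const replicaRedis = new Redis({ host: "10.0.1.51", port: 6379 });

// Route latency-tolerant reads to a replica (userId comes from the caller)
const userProfile = await replicaRedis.get(`user:${userId}:profile`);

// Always write to the primary
await primaryRedis.hset(`user:${userId}`, 'lastSeen', Date.now().toString());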

What must NOT be read from replicas:

Session tokens (replication lag → "token not found" → false logout)
Lock state (lag → two clients think they hold the lock)
Rate limit counters (lag → allowing more requests than permitted)
Any coordination state where stale data causes incorrect decisions

Safe to read from replicas:

Cached content (product data, blog posts, configuration)
Analytics/reporting queries
Historical data that changes slowly
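One way to keep this split from eroding over time is to make the routing decision explicit in code instead of leaving it to each call site. A minimal sketch, assuming the application can label each read with a data class; the labels and names below are illustrative, not part of any library.

```typescript
// Decide where a read should go based on what kind of data it touches.
// The data classes mirror the two lists above; names are illustrative.
type DataClass =
  | "session"
  | "lock"
  | "rate-limit"
  | "cached-content"
  | "analytics"
  | "historical";

type ReadTarget = "primary" | "replica";

// Only classes explicitly listed here may be served from a replica.
const REPLICA_SAFE_CLASSES: ReadonlySet<DataClass> = new Set<DataClass>([
  "cached-content",
  "analytics",
  "historical",
]);

function readTargetFor(dataClass: DataClass): ReadTarget {
  return REPLICA_SAFE_CLASSES.has(dataClass) ? "replica" : "primary";
}
```

The allow-list direction is deliberate: a data class that nobody has thought about goes to the primary, so a mistake costs some primary load rather than a stale read of coordination state.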

Replication and Lua Scripts

Lua scripts executed on the primary are replicated to replicas in one of two ways:

Script replication (the default before Redis 5.0, removed in Redis 7.0): The entire EVAL command is replicated and replicas re-execute the script. A script whose writes depend on non-deterministic input (TIME, SRANDMEMBER) could compute different values on each replica, so in this mode Redis rejects write commands issued after a non-deterministic call.
Effect replication (available since Redis 3.2, the default since Redis 5.0, the only mode since Redis 7.0): Only the write commands issued by the script are replicated. Replicas receive the same writes regardless of how the script computed them.

text

# Redis 7.0+: effect replication is the only mode; nothing to configure.
# Redis 5.0 to 6.x: effect replication is the default (controlled by lua-replicate-commands, since deprecated).

# Redis 3.2 to 4.x: opt in per script:
redis.replicate_commands()   → call at the start of the Lua script, before any write

If your Lua scripts use non-deterministic functions on a version older than Redis 5.0, call redis.replicate_commands() first so that replicas cannot diverge. On Redis 7.0+ the call is deprecated and no longer needed.
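The difference between the two modes is easy to see in a toy simulation. Nothing below talks to Redis: the "script" is a plain function that takes a random source, the stores are `Map`s, and every name is hypothetical.

```typescript
// Toy comparison of verbatim script replication vs effect replication.
// All names are hypothetical; no Redis calls are involved.
type KvStore = Map<string, string>;
type KvWrite = { key: string; value: string };
type ToyScript = (random: () => number) => KvWrite[];

// A non-deterministic "script": the value it writes depends on a random draw.
const pickWinner: ToyScript = (random) => [
  { key: "winner", value: `user:${Math.floor(random() * 1000)}` },
];

function applyWrites(store: KvStore, writes: KvWrite[]): void {
  for (const w of writes) store.set(w.key, w.value);
}

// Verbatim: every node runs the script itself, with its own randomness.
function replicateVerbatim(
  script: ToyScript,
  primary: KvStore,
  replica: KvStore,
  primaryRandom: () => number,
  replicaRandom: () => number,
): void {
  applyWrites(primary, script(primaryRandom));
  applyWrites(replica, script(replicaRandom));
}

// Effects: the script runs once on the primary; only its writes are shipped.
function replicateEffects(
  script: ToyScript,
  primary: KvStore,
  replica: KvStore,
  primaryRandom: () => number,
): void {
  const writes = script(primaryRandom);
  applyWrites(primary, writes);
  applyWrites(replica, writes);
}
```

When the two nodes draw different random numbers, the verbatim path leaves `winner` holding different values on the primary and the replica. The effects path cannot diverge, because the replica never evaluates the random draw.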

Diskless Replication

Full resync normally generates an RDB file on disk, then streams it. Diskless replication streams the RDB directly from memory to the replica socket — no disk write:

text

repl-diskless-sync yes           → stream RDB directly (faster for large datasets on slow disks; default yes since Redis 7.0)
repl-diskless-sync-delay 5       → wait 5 seconds for more replicas to connect before starting
repl-diskless-sync-max-replicas 0 → start before the delay expires once this many replicas are waiting (0 = always wait the full delay)

Diskless replication is faster when disk I/O is the bottleneck (many replicas connecting simultaneously, or a slow disk). The memory footprint is similar to disk-based sync: the primary still forks a child process, so copy-on-write pages and the replication output buffer still apply.

Common Replication Problems

Growing Replication Lag

Cause: Replica cannot apply commands as fast as the primary sends them.

Symptoms: The lag value on the slaveN lines of INFO replication increases steadily, and the replica's slave_repl_offset falls further behind master_repl_offset.

Diagnosis:

Check replica CPU — if at 100%, the replica is compute-bound on command application
Check network bandwidth between primary and replica
Check if the primary is issuing very large commands (MSET with thousands of keys) that take time to apply

Fix: Scale the replica vertically, or reduce the primary's write rate by batching or throttling.
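To tell a steady climb from ordinary jitter between 0 and 1, it helps to look at a short window of lag samples instead of a single reading. A minimal sketch, assuming you poll the per-replica `lag` value at a fixed interval; the 5-second alert threshold matches the one suggested in the summary, and the "three rising samples" rule is an arbitrary choice for illustration.

```typescript
// Classify a window of lag samples (seconds, oldest first).
// Names and the three-sample rule are illustrative assumptions.
type LagVerdict = "healthy" | "growing" | "alert";

function classifyLag(
  lagSamplesSeconds: number[],
  alertThresholdSeconds = 5,
): LagVerdict {
  if (lagSamplesSeconds.length === 0) return "healthy";

  const latest = lagSamplesSeconds[lagSamplesSeconds.length - 1] ?? 0;
  if (latest > alertThresholdSeconds) return "alert";

  // "Growing" means every sample in the window is higher than the one before.
  const strictlyRising =
    lagSamplesSeconds.length >= 3 &&
    lagSamplesSeconds.every(
      (value, i, all) => i === 0 || value > (all[i - 1] ?? Infinity),
    );
  return strictlyRising ? "growing" : "healthy";
}
```

A window like `[0, 1, 0, 1]` stays healthy, `[1, 2, 3, 4]` is flagged as growing before it crosses the threshold, and `[2, 4, 7]` triggers the alert.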

Replica Requires Full Resync After Brief Disconnect

Cause: Replication backlog too small — the commands since the disconnect are no longer in the buffer.

Fix: Increase repl-backlog-size. Calculate required size: max_disconnection_time_seconds × write_rate_bytes_per_second.

Client Reading Stale Data from Replica

Cause: Reading coordination-critical data from a replica with non-trivial lag.

Fix: Route coordination reads to the primary. See P-9 (session management) and A-6 (SupraScan) for the correct pattern.

Summary

Redis replication is asynchronous — primary acknowledges writes before replicas confirm receipt; data loss is possible in the replica lag window
Full resync (PSYNC ? -1, or an offset that has already left the backlog): replica loads an RDB snapshot from the primary; happens on first connect or after falling too far behind
Partial resync (PSYNC with known replication ID + offset): replica catches up from the replication backlog — fast; only possible if offset is still in the backlog
Size repl-backlog-size = max_expected_disconnect_seconds × write_rate — too small causes unnecessary full resyncs
Monitor the lag field on each slaveN line (seconds) and the offset delta (bytes); alert when lag exceeds 5 seconds
min-replicas-to-write prevents writes when replicas are disconnected — choose availability vs durability
Route read-heavy, latency-tolerant queries to replicas; always read coordination state from the primary
Effect replication prevents non-deterministic Lua script divergence on replicas

Next: A-8 — Redis Sentinel: Quorum, Failover, and Split-Brain Prevention — the automatic failover system that promotes a replica to primary and reconfigures clients when the primary goes down.

