Module P-1 · 20 min read

How BGSAVE uses fork() and copy-on-write, the RDB file format, snapshot scheduling with save directives, RDB compression, and the trade-off: data loss up to the last snapshot interval.

P-1 — RDB Snapshots: Point-in-Time Persistence

Q: A team configures Redis with the persistence rule `save 60 10000`. During a massive traffic spike, their application writes 15,000 keys to Redis in the first 10 seconds of a minute, but the server experiences a total power failure exactly at the 50-second mark. How much of the newly written data survives?

None of it, assuming the last snapshot completed at the start of that minute. The server lost power before 60 seconds had elapsed since the last save, so the snapshot condition was never met and all 15,000 writes are lost. The `save <seconds> <changes>` directive means "snapshot if at least `<changes>` writes have occurred and at least `<seconds>` seconds have passed since the last save". Redis evaluates these conditions periodically from its background timer. At the 50-second mark the write threshold had been exceeded but the 60-second threshold had not, so no BGSAVE was started. This illustrates the primary risk of RDB: your potential data loss is everything written since the last successful snapshot.

Q: An engineer provisions a Redis server with 16GB of RAM. The dataset is 12GB and growing. Nightly `BGSAVE` operations run without issue, but one day, during a massive data import, the server abruptly runs out of memory (OOM) and crashes exactly when a `BGSAVE` triggers. Why did this happen?

`BGSAVE` relies on the OS's copy-on-write (CoW) mechanism. While the parent process continues serving writes, the OS duplicates modified memory pages. Because the data import caused massive writes during the snapshot, CoW consumed the remaining 4GB of RAM, triggering an OOM kill. — RDB snapshots use `fork()` to create a background process. Both processes initially share the exact same physical memory. However, when the parent process handles a write operation, the OS executes a "copy-on-write", duplicating that specific memory page so the parent can modify it without altering the child's point-in-time view. During a massive import, nearly every page might be modified, effectively doubling the dataset's memory requirement (12GB + 12GB = 24GB). Since the server only had 16GB, it OOMed.

Q: What is the consequence of leaving `stop-writes-on-bgsave-error yes` (the default) enabled in a production environment where the disk periodically runs out of space?

Redis will actively reject all new write commands from clients, returning an error, until a background save succeeds. — This configuration acts as a hard safety brake. If Redis is configured to persist data but the disk fails (e.g., out of space, permission errors), the system assumes the operator needs to be notified immediately that data durability is compromised. To prevent the in-memory dataset from diverging completely from the last durable backup, Redis stops accepting write commands (returning a `MISCONF` error) while continuing to serve read requests.

Who this module is for: You know Redis loses data on restart by default and that "persistence" is something you can configure, but you have never understood how it actually works, what the performance cost is, or when RDB is the right choice vs AOF. This module covers the RDB snapshot mechanism end-to-end, including the fork-and-copy-on-write internals that most documentation glosses over.

The Default: Snapshots Only

When you start Redis with default settings and no redis.conf, AOF is off but RDB snapshotting is enabled through these built-in save directives:

save 3600 1    → snapshot if at least 1 key changed in the last 3600 seconds
save 300 100   → snapshot if at least 100 keys changed in the last 300 seconds
save 60 10000  → snapshot if at least 10000 keys changed in the last 60 seconds

For a pure cache (Redis as a read-through cache in front of PostgreSQL), you should disable this:

save ""   → disable all automatic snapshots

For any use case where data must survive a restart (session store, rate limiter state, queues), you must configure persistence explicitly — either RDB, AOF, or both.

What RDB Is

RDB (Redis Database Backup) is a point-in-time snapshot of the entire Redis dataset, written to a single compact binary file on disk. By default, this file is named dump.rdb.

When Redis restarts, it reads dump.rdb and restores the entire dataset into memory. Startup time is proportional to the size of the dataset.

The guarantee: All data that existed at the moment the snapshot was taken survives a restart. Data written after the snapshot and before the crash is lost.

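The trade-off: The maximum data loss equals the time since the last successful snapshot. With save 60 10000 and a sustained rate above 10,000 writes per minute, that is roughly 60 seconds of writes. At lower write rates this rule never fires, and the loss window is set by whichever slower rule fires first.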
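To make the loss window concrete, the sketch below estimates how long a steady stream of writes runs before any save rule fires, which is roughly how much data a crash just before that moment would cost. It is a simplification rather than Redis's own logic: it assumes a constant write rate that starts right after a completed snapshot, and the function name `seconds_until_snapshot` is made up for illustration.

```python
import math

# Default save rules as (seconds, changes) pairs.
DEFAULT_RULES = [(3600, 1), (300, 100), (60, 10000)]


def seconds_until_snapshot(rules, writes_per_second):
    """Estimate seconds until the first save rule fires at a steady write rate.

    A rule (seconds, changes) needs both conditions: enough time elapsed
    and enough changes accumulated. Returns math.inf if no rule can fire.
    """
    if writes_per_second <= 0:
        return math.inf
    return min(
        (max(seconds, changes / writes_per_second) for seconds, changes in rules),
        default=math.inf,
    )


if __name__ == "__main__":
    for rate in (1000, 1, 0.01):
        print(rate, seconds_until_snapshot(DEFAULT_RULES, rate))
```

Under these assumptions, at 1,000 writes per second the 60-second rule dominates, while at 1 write per second the first snapshot comes from the 300-second rule, so a crash can cost up to five minutes of writes.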

How BGSAVE Works: Fork and Copy-on-Write

This is the part that most tutorials skip, and it is the most important thing to understand about RDB.

Redis cannot stop serving requests while it writes the snapshot — that could take minutes for a large dataset. Instead, it uses the Unix fork() system call to create a child process that handles the snapshot while the parent continues serving clients.

Redis parent process (serving requests)
        │
        ├── BGSAVE triggered
        │
        └── fork()
              │
              ├── Parent process ──────────────────→ continues serving clients
              │   (still has all data in memory)      (writes go to parent's memory)
              │
              └── Child process ───────────────────→ writes entire dataset to disk
                  (sees a snapshot of parent's       (reads from shared pages)
                   memory at fork time)

Copy-on-Write (CoW)

After fork(), both the parent and child share the same physical memory pages. The OS uses copy-on-write semantics: when the parent modifies a page (because a client writes to Redis), the OS creates a private copy of that page for the parent. The child still sees the original page.

This means:

The child writes the snapshot based on the data as it existed at fork time — a consistent point-in-time view
The parent continues serving writes normally
Modified pages are duplicated in RAM — one version for the parent (new value) and one for the child (old value, being written to disk)
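The same behavior can be observed in miniature with Python's `os.fork()`, which wraps the same system call on Unix. This is only an illustrative sketch, not how Redis is implemented: the dictionary stands in for the dataset, and the two pipes (names chosen here for the example) let the child wait until the parent has written before it reports what it sees.

```python
import json
import os


def fork_snapshot_demo():
    """Fork, modify data in the parent, and return (child_view, parent_view)."""
    data = {"counter": 1}
    result_r, result_w = os.pipe()  # child -> parent: the child's view of data
    ready_r, ready_w = os.pipe()    # parent -> child: "I have written" signal

    pid = os.fork()
    if pid == 0:
        # Child: wait for the parent's write, then report our own view.
        try:
            os.close(result_r)
            os.close(ready_w)
            os.read(ready_r, 1)
            os.write(result_w, json.dumps(data).encode())
        finally:
            os._exit(0)

    # Parent: modify the data after the fork, then let the child continue.
    os.close(result_w)
    os.close(ready_r)
    data["counter"] = 2
    os.write(ready_w, b"x")
    child_view = json.loads(os.read(result_r, 4096).decode())
    os.waitpid(pid, 0)
    os.close(result_r)
    os.close(ready_w)
    return child_view, data


if __name__ == "__main__":
    print(fork_snapshot_demo())
```

Even though the child reads only after the parent has changed the value, it still reports the value from fork time. That is the point-in-time view the snapshot relies on.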

The memory spike: During a snapshot, Redis can temporarily use up to 2× its normal working set of RAM. If your dataset is 4GB and writes are heavy during the snapshot, Redis might use 6-7GB before the snapshot finishes and the child exits.

This is why you must size your Redis server's RAM to accommodate BGSAVE overhead. A rule of thumb: provision 1.5× your expected dataset size for a write-heavy Redis.
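The sizing arithmetic can be sketched as follows. This is a back-of-the-envelope model only: it assumes the extra memory is exactly the share of pages rewritten during the snapshot, and it ignores the operating system, client buffers, and allocator fragmentation. Both function names are hypothetical.

```python
def peak_memory_gb(dataset_gb, dirty_fraction):
    """Rough peak RAM during BGSAVE: the dataset plus copies of dirtied pages."""
    if not 0.0 <= dirty_fraction <= 1.0:
        raise ValueError("dirty_fraction must be between 0 and 1")
    return dataset_gb * (1.0 + dirty_fraction)


def max_safe_dirty_fraction(dataset_gb, ram_gb):
    """Largest share of pages that can be rewritten before RAM runs out."""
    headroom_gb = ram_gb - dataset_gb
    return max(0.0, min(1.0, headroom_gb / dataset_gb))


if __name__ == "__main__":
    print(peak_memory_gb(4, 0.5))           # 4GB dataset, half the pages rewritten
    print(max_safe_dirty_fraction(12, 16))  # 12GB dataset on a 16GB server
```

In this model, a 12GB dataset on a 16GB server can tolerate only about a third of its pages being rewritten during a snapshot before memory is exhausted.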

Fork Latency

fork() does not copy the dataset itself, only the parent's page table. On large datasets the page table is large too, and Redis pauses while the OS copies it for the child. The pause is typically a matter of milliseconds, but it grows with dataset size.

With a 10GB dataset and 4KB pages (the default on x86-64 Linux), the page table has roughly 2.5 million entries. With 2MB huge pages it has only about 5,000. More entries mean longer fork() latency. Huge pages shorten the fork but make every copy-on-write fault copy 2MB instead of 4KB, which is why the Redis documentation recommends disabling transparent huge pages.
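The entry counts follow directly from dividing the dataset size by the page size. A minimal sketch of that arithmetic, counting only leaf entries and using binary gigabytes (the helper name `page_table_entries` is made up here):

```python
GIB = 1024 ** 3
KIB = 1024
MIB = 1024 ** 2


def page_table_entries(dataset_bytes, page_size_bytes):
    """Number of pages, and so leaf page-table entries, needed to map the dataset."""
    return -(-dataset_bytes // page_size_bytes)  # ceiling division


if __name__ == "__main__":
    print(page_table_entries(10 * GIB, 4 * KIB))  # 4KB pages
    print(page_table_entries(10 * GIB, 2 * MIB))  # 2MB huge pages
```

With binary gigabytes this gives 2,621,440 entries for 4KB pages and 5,120 for 2MB pages, in line with the rounded figures quoted in the text.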

To check fork latency:

INFO stats
→ latest_fork_usec: 512   ← last fork took 512 microseconds

Values above 10,000 microseconds (10ms) indicate a very large dataset or a system with memory pressure.

Triggering Snapshots

Automatic (via save directives)

# In redis.conf
save 3600 1
save 300 100
save 60 10000

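Redis checks these conditions periodically from its background timer. A rule is met when at least that many changes have accumulated and at least that many seconds have passed since the last save. Conditions are OR'd: meeting any one rule triggers a BGSAVE automatically.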
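The rule check can be sketched as a small predicate. This is an illustration of the logic described above, not code taken from Redis: the function name `rules_met` is hypothetical, and the exact handling of the boundary (at exactly N seconds versus just after) is an assumption.

```python
# Save rules as (seconds, changes) pairs, matching the redis.conf lines above.
RULES = [(3600, 1), (300, 100), (60, 10000)]


def rules_met(rules, changes_since_save, seconds_since_save):
    """Return the rules whose change and time thresholds are both satisfied."""
    return [
        (seconds, changes)
        for seconds, changes in rules
        if changes_since_save >= changes and seconds_since_save >= seconds
    ]


if __name__ == "__main__":
    # 15,000 writes but only 50 seconds since the last save: nothing fires.
    print(rules_met(RULES, 15000, 50))
    # The same writes once a minute has passed: the 60/10000 rule fires.
    print(rules_met(RULES, 15000, 61))
```

A snapshot is started whenever the returned list is non-empty, which is the OR behavior: one satisfied rule is enough.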

Manual

BGSAVE              → async: triggers background save, returns immediately
SAVE                → sync: blocks Redis until snapshot is complete (avoid in production)
BGSAVE SCHEDULE     → if an AOF rewrite is in progress, schedule the save for the next opportunity instead of returning an error

SAVE is a blocking command — no client can be served while it runs. Only use it when you need a guaranteed consistent snapshot before a maintenance operation and can tolerate the downtime.

On Shutdown

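On a graceful shutdown (SIGTERM or the SHUTDOWN command), Redis stops serving clients and then performs a blocking SAVE, provided at least one save point is configured. Because no writes are accepted while that final snapshot is written, a clean shutdown loses no data. If Redis is killed (SIGKILL) or crashes, no shutdown snapshot is taken.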

RDB File Format and Compression

The dump.rdb file is a compact binary format that stores:

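A header with the magic string REDIS and a four-digit RDB format version
Auxiliary metadata fields, including the Redis version that wrote the file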
Database select markers
Each key-value pair with:
- Key TTL (if set)
- Value type byte
- Encoded value (type-specific encoding)
A checksum (CRC64)
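As a small illustration of the header layout, the sketch below checks the first nine bytes of a file: the five-byte magic string `REDIS` followed by a four-digit ASCII format version. It reads nothing beyond the header and is not a substitute for redis-check-rdb. The function name `parse_rdb_header` is made up for this example.

```python
def parse_rdb_header(first_bytes):
    """Return the RDB format version from the first 9 bytes of a dump file."""
    if len(first_bytes) < 9 or first_bytes[:5] != b"REDIS":
        raise ValueError("not an RDB file: missing REDIS magic string")
    version_field = first_bytes[5:9]
    if not version_field.isdigit():
        raise ValueError("RDB version field is not numeric")
    return int(version_field.decode("ascii"))


if __name__ == "__main__":
    # Sample header bytes; a real check would use open(path, "rb").read(9).
    print(parse_rdb_header(b"REDIS0011"))
```

A quick check like this can tell you whether a backup is an RDB file at all, and which format version wrote it, before you attempt a restore.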

RDB uses LZF compression for large String values by default (rdbcompression yes). Compression reduces file size but adds CPU cost during save and load.

# In redis.conf
rdbcompression yes        → compress string objects (default)
rdbchecksum yes           → validate checksum on load (default)
dbfilename dump.rdb       → snapshot filename
dir /var/lib/redis        → directory for snapshot and AOF files

Checking RDB File Validity

redis-check-rdb /var/lib/redis/dump.rdb

This verifies the RDB file's checksum and structural integrity — useful before restoring a backup.

Configuring RDB in Practice

Pure Cache (disable RDB)

save ""

Infrequent Snapshot (low write rate, tolerate data loss)

save 3600 1    → once an hour if anything changed

Frequent Snapshot (moderate write rate)

save 900 1     → every 15 minutes if at least 1 change
save 300 10    → every 5 minutes if at least 10 changes

Stop-on-Failure Behavior

stop-writes-on-bgsave-error yes   → default: refuse writes if last BGSAVE failed

This is a safety mechanism: if Redis cannot write the snapshot (disk full, permission error), it stops accepting writes to prevent data divergence between the in-memory state and the last successful snapshot. In production, monitor for BGSAVE errors: nothing alerts the operator when this setting takes effect, and clients simply start receiving MISCONF errors on write commands. Writes resume once a background save succeeds.

Monitoring RDB

INFO persistence

Key fields:

rdb_changes_since_last_save: 1423     → writes since last snapshot
rdb_bgsave_in_progress: 0            → 1 if currently saving
rdb_last_save_time: 1717000000       → Unix timestamp of last successful save
rdb_last_bgsave_status: ok           → "ok" or "err"
rdb_last_bgsave_time_sec: 3          → seconds for last snapshot
rdb_current_bgsave_time_sec: -1      → seconds elapsed for in-progress save (-1 if none)
rdb_saves: 42                         → total successful saves since start
rdb_last_cow_size: 8388608           → copy-on-write memory used during last save (bytes)

rdb_last_cow_size tells you how much extra memory was needed for copy-on-write during the snapshot. If this number is approaching your free RAM, you have a memory sizing problem.
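A monitoring check along these lines could be sketched as below. It works on the raw text that INFO returns, so it does not depend on any particular client library. The function names and the 50% threshold (`cow_ratio`) are assumptions chosen for illustration, not recommended values.

```python
def parse_info(text):
    """Parse INFO output ("key:value" lines, "#" section headers) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def rdb_warnings(fields, free_ram_bytes, cow_ratio=0.5):
    """Return human-readable warnings based on INFO persistence fields."""
    warnings = []
    if fields.get("rdb_last_bgsave_status") != "ok":
        warnings.append("last BGSAVE failed")
    cow_bytes = int(fields.get("rdb_last_cow_size", 0))
    if cow_bytes >= cow_ratio * free_ram_bytes:
        warnings.append("copy-on-write usage is close to free RAM")
    return warnings


if __name__ == "__main__":
    sample = (
        "# Persistence\r\n"
        "rdb_last_bgsave_status:ok\r\n"
        "rdb_last_cow_size:8388608\r\n"
    )
    print(rdb_warnings(parse_info(sample), free_ram_bytes=4 * 1024 ** 3))
```

In practice the free-RAM figure would come from the host's own metrics, and the warnings would feed whatever alerting system is already in place.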

RDB vs AOF at a Glance

Concern	RDB	AOF
Data loss on crash	Up to snapshot interval	Typically ≤ 1 second (with `everysec`)
Recovery speed	Fast (load binary)	Slow (replay all commands)
File size	Compact	Grows until rewrite
Disk I/O during normal operation	Only at snapshot intervals	Continuous (every command)
Performance impact	Short fork pause, CoW memory spike	Minimal (append-only write)
Best for	Backups, tolerable data loss	Near-durability requirement

The full comparison and decision framework are in P-3. P-2 covers AOF in depth.

Summary

RDB creates a point-in-time snapshot of the entire dataset using fork() + copy-on-write
The child process writes the snapshot while the parent continues serving clients; the only blocking is the brief fork() pause
During BGSAVE, Redis can use up to 2× its dataset size in RAM due to copy-on-write page duplication
fork() itself may pause Redis for milliseconds on large datasets — monitor latest_fork_usec
Configure save directives to control automatic snapshot frequency; BGSAVE for manual snapshots
Maximum data loss = time since last successful snapshot
Monitor with INFO persistence: watch rdb_last_bgsave_status and rdb_last_cow_size
stop-writes-on-bgsave-error yes (default) stops all writes if BGSAVE fails — monitor for this in production

Next: P-2 — AOF: Append-Only File Mechanics and fsync Strategies — how Redis logs every write command to disk, the three fsync modes and their durability-vs-latency trade-offs, and how AOF rewrite prevents unbounded log growth.
