Documentation Course

Redis In-Depth

From Runtime Basics to Distributed Architecture at Scale

A complete A-Z Redis curriculum — from the in-memory model and data structures for beginners through caching patterns, persistence, and Streams for mid-level engineers to Lua scripting, distributed locking algorithms, and Cluster topology for senior engineers. Built from production experience coordinating 10+ concurrent scanner instances at 2,000+ TPS on a live blockchain indexer.

39 modules · Beginner to Advanced · Written / Documentation-style · 39 of 39 modules live

How to use this course

This course works as both a sequential read and a standalone reference. Read front-to-back to build a complete mental model of Redis — from the in-memory model to Cluster topology. Or jump to any module when you hit a production issue — a cache stampede, an eviction surprise, a distributed lock bug, or a replication lag incident.

Total reading time

~15 hrs

across 39 modules

Built from

2,000+ TPS

10+ concurrent scanner instances

Prerequisite

Zero Redis to production internals

Beginner through senior engineer

Phase 1 — Foundation

Absolute basics · No prior knowledge assumed

12 modules
F-1
What Is Redis and Why Does It Exist?
The problem Redis solves vs relational databases, the in-memory model, Redis vs Memcached, installing Redis, and your first key-value operations — zero prior knowledge assumed.
F-2
Strings, Numbers, and Binary Safety
Why Redis Strings are not strings — they are binary-safe byte arrays. Integer encoding, atomic INCR/DECR, the GET/SET command family, MGET/MSET for batching, and why storing JSON blobs is the first mistake engineers make.
F-3
Lists, Hashes, Sets, and Sorted Sets — Internal Encodings
The full data structure surface and how each uses listpack vs ziplist vs skiplist vs hashtable encoding under the hood — what triggers encoding upgrades and how encoding determines memory usage.
F-4
Redis's Single-Threaded Event Loop — Speed, Limits, and Multithreading
How the I/O multiplexing event loop handles 100K+ RPS on a single thread, what Redis 6+ multithreading actually parallelises (I/O, not command execution), and the one failure mode nobody warns you about: one slow command blocks every client.
F-5
TTL, Expiry, and Eviction
How Redis expires keys via lazy and active expiry, the eight eviction policies (allkeys-lru, volatile-ttl, noeviction and more), maxmemory configuration, and how to monitor evictions with INFO stats.
F-6
HyperLogLog, Bitmaps, and Geospatial
Probabilistic cardinality estimation with PFADD/PFCOUNT in at most 12 KB per key, per-user boolean tracking at scale with SETBIT/BITCOUNT, and location-aware radius queries with GEOADD/GEOSEARCH.
F-7
Pipelining and the RESP Protocol
Why sequential redis.get() in a loop is the most common Redis performance mistake, what RESP looks like on the wire, how pipelining eliminates per-command RTT, and when to use MGET/MSET vs explicit pipelines.
F-8
Pub/Sub and the Message Fanout Model
SUBSCRIBE, PUBLISH, PSUBSCRIBE for pattern channels, the no-persistence guarantee, keyspace notifications, horizontal scaling across app servers, and when to use Streams instead of Pub/Sub.
F-9
Streams: Append-Only Logs and Consumer Groups
XADD/XREAD/XRANGE, the XREADGROUP consumer group model, XACK and the Pending Entry List, XAUTOCLAIM for crash recovery, dead-letter handling, and when Redis Streams beats Kafka or BullMQ.
F-10
Memory Layout and Object Encoding Internals
The robj structure, SDS strings, jemalloc size classes, memory fragmentation ratio, active defragmentation, MEMORY USAGE per key, and practical strategies for cutting Redis memory usage by 50%.
F-11
Transactions: MULTI, EXEC, and Optimistic Locking with WATCH
What Redis transactions actually guarantee, why runtime errors do not roll back, WATCH-based optimistic locking for read-modify-write patterns, retry loops, and when WATCH contention demands Lua scripts instead.
F-12
Caching Patterns: Cache-Aside, Write-Through, Write-Behind, and Read-Through
The four caching strategies with their consistency guarantees, failure modes, and implementation trade-offs. Cache stampede prevention, invalidation timing, cache warm-up, and the decision framework for choosing the right pattern.
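
As a small taste of module F-7, the wire format can be sketched in a few lines. The block below is a library-agnostic illustration, not part of any client: the helper names `encode_command` and `encode_pipeline` are hypothetical. It encodes a command as a RESP array of bulk strings and shows that a pipeline is simply several encoded commands sent in one write.

```python
from typing import Iterable, Sequence, Union

Arg = Union[str, bytes]


def encode_command(*args: Arg) -> bytes:
    """Encode one command as a RESP array of bulk strings (hypothetical helper)."""
    # Array header: "*<number of elements>\r\n"
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8") if isinstance(arg, str) else arg
        # Bulk string: "$<length in bytes>\r\n<payload>\r\n"
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def encode_pipeline(commands: Iterable[Sequence[Arg]]) -> bytes:
    """Concatenate encoded commands so they can be written to the socket at once."""
    return b"".join(encode_command(*cmd) for cmd in commands)


if __name__ == "__main__":
    print(encode_command("GET", "foo"))
    print(encode_pipeline([("PING",), ("GET", "a")]))
```

Because bulk-string lengths count bytes rather than characters, arbitrary binary values pass through unchanged, which is the binary-safety property covered in F-2.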

Phase 2 — Practitioner

Production patterns · Real application architecture

12 modules
P-1
RDB Snapshots: Point-in-Time Persistence
How BGSAVE uses fork() and copy-on-write, the RDB file format, snapshot scheduling with save directives, RDB compression, and the trade-off: data loss up to the last snapshot interval.
P-2
AOF: Append-Only File Mechanics and fsync Strategies
How AOF logs every write command, the three appendfsync strategies (always/everysec/no) and their durability-vs-latency trade-offs, AOF rewrite to prevent unbounded log growth, and the hybrid RDB+AOF preamble format.
P-3
Persistence Decision Framework: RDB vs AOF vs Both vs None
The decision matrix: RDB for snapshots with acceptable data loss, AOF for near-durability, both for maximum recovery, none for pure caches. Recovery time estimates, storage overhead, and when managed Redis changes the calculus.
P-4
Memory Profiling and Optimization
INFO memory field-by-field, MEMORY USAGE and MEMORY DOCTOR, scanning for oversized keys, encoding threshold tuning, active defragmentation configuration, and a production workflow for diagnosing unexpected memory growth.
P-5
Atomic Counters, Rate Limiters, and Sliding Windows
INCR/INCRBY as lock-free counters, fixed-window rate limiter with INCR+EXPIRE, sliding window using Sorted Sets with timestamp scores, token bucket algorithm, and the off-by-one race in fixed-window that Lua eliminates.
P-6
BullMQ Internals: The Redis Data Structures Behind the Job Queue
How BullMQ maps job lifecycle to Sorted Sets, Lists, and Hashes. Worker polling, delayed job scheduling, stalled job detection via heartbeat, the rate limiter internals, and choosing BullMQ vs raw Streams.
P-7
Cache Stampede, Avalanche, and Penetration
Three distinct failure modes requiring different solutions — XFetch probabilistic expiry for stampede, TTL jitter for avalanche, Bloom filter pre-gating for penetration. Detection patterns and production mitigations.
P-8
Keyspace Notifications and Event-Driven Architectures
notify-keyspace-events configuration, the event type matrix (key expiry, deletion, Set/List/Hash writes), subscribing to expiry for cache warming, and the critical limitation: keyspace notifications are at-most-once.
P-9
Session Management Patterns
Storing sessions as Hashes vs JSON strings, sliding expiry with EXPIRE on each request, session invalidation and logout, multi-device session tracking, and the consistency trade-offs when reading sessions from replicas.
P-10
Connection Pooling and Client Configuration
TCP connection overhead, ioredis connection pool sizing, reconnection strategies with exponential backoff, command timeout configuration, lazy connect vs eager connect, and health check patterns for production clients.
P-11
Monitoring and Observability
The INFO command section by section (server, clients, memory, stats, replication, keyspace), SLOWLOG for identifying slow commands, LATENCY HISTORY, MONITOR for live command tracing, and the 10 metrics every Redis dashboard must have.
P-12
Security: ACLs, TLS, and Network Hardening
Redis ACL rules for per-user command and key restrictions, TLS configuration for in-transit encryption, bind address and protected mode, requirepass and its limitations, and the most common Redis security misconfigurations.
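
The sliding-window limiter from module P-5 can be modelled without a server. The sketch below is an in-memory stand-in, not Redis code: a sorted Python list plays the role of the Sorted Set's timestamp scores, and the class name `SlidingWindowLimiter` is hypothetical. The comments name the Redis commands that each step would correspond to.

```python
from bisect import bisect_right, insort
from typing import List


class SlidingWindowLimiter:
    """In-memory model of a Sorted Set based sliding-window rate limiter."""

    def __init__(self, limit: int, window_ms: int) -> None:
        self.limit = limit
        self.window_ms = window_ms
        self._timestamps: List[int] = []  # kept sorted, like Sorted Set scores

    def allow(self, now_ms: int) -> bool:
        # Step 1 (ZREMRANGEBYSCORE): drop entries at or before the window start.
        cutoff = now_ms - self.window_ms
        del self._timestamps[: bisect_right(self._timestamps, cutoff)]
        # Step 2 (ZCARD): count the requests still inside the window.
        if len(self._timestamps) >= self.limit:
            return False
        # Step 3 (ZADD): record this request with its timestamp as the score.
        insort(self._timestamps, now_ms)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(limit=2, window_ms=1000)
    for t in (0, 100, 200, 1000, 1050, 1101):
        print(t, limiter.allow(t))
```

Against a real server the three steps would need to run atomically (for example inside MULTI/EXEC or a Lua script), and each Sorted Set member would need to be unique so that two requests in the same millisecond do not overwrite each other.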

Phase 3 — Architect

Engine internals · High-throughput systems · Distributed architecture

15 modules
A-1
Single-Instance Locking: SET NX PX and Lock Correctness
SET key value NX PX as the atomic lock primitive, UUID lock values to prevent accidental release, lock extension with conditional PEXPIRE, the critical GC-pause failure mode, and why distributed locks need fencing tokens.
A-2
Lua Scripting: EVAL, EVALSHA, and Atomic Compound Operations
Why Lua scripts execute atomically, KEYS and ARGV conventions, redis.call() vs redis.pcall(), SCRIPT LOAD and EVALSHA for script caching, atomic rate limiters and conditional operations impossible without Lua.
A-3
Redis Functions: Persistent Stored Procedures
How Redis Functions differ from Lua scripts — functions persist across restarts in RDB/AOF. Function libraries, the shebang declaration, registering multiple functions in one library, replication semantics, and migration from EVALSHA.
A-4
Redlock: The Algorithm, Its Guarantees, and Its Critics
The Redlock algorithm step by step across N independent Redis instances, what it guarantees under bounded clock drift, what Martin Kleppmann's critique gets right, fencing tokens as the correct complement, and implementation with node-redlock (the redlock npm package).
A-5
Reentrant Locks, Hierarchies, and Deadlock Prevention
Reentrant locking with Hash-stored reentry counters, lock hierarchies and consistent ordering to prevent circular waits, the watchdog pattern for automatic lock extension, and distinguishing mutex locks from counting semaphores.
A-6
The SupraScan Architecture: Coordinating 10+ Concurrent Scanner Instances
The real-world problem: 10+ multithreaded Node.js instances processing Kafka-delivered blockchain blocks at 2,000+ TPS without double-processing. Block-range partitioning via Redis locks, heartbeat extension, crash recovery, and the 6-hour replication lag incident.
A-7
Master-Replica Replication: PSYNC, Replication Buffer, and Lag
Full sync vs partial resync (PSYNC), the replication backlog and replica reconnection, INFO replication field breakdown, measuring replication lag, and the data-loss window that opens when the primary acknowledges a write before any replica has received it.
A-8
Redis Sentinel: Quorum, Failover, and Split-Brain Prevention
Sentinel as an independent high-availability process, subjective down vs objective down, the failover election sequence, min-replicas-to-write for split-brain prevention, and what Sentinel cannot protect against.
A-9
Redis Cluster: Hash Slots and Data Distribution
The 16,384 hash slot model, CRC16 key hashing, hash tags for co-locating keys on the same slot, MOVED vs ASK redirections, multi-key command constraints in Cluster, and cluster-enabled client configuration.
A-10
Resharding, Node Addition, and Live Slot Migration
Adding a node to a running cluster, assigning slots, CLUSTER SETSLOT MIGRATING/IMPORTING, the MIGRATE command, zero-downtime resharding with ASKING redirections, and why resharding time scales with key count.
A-11
Gossip Protocol and Network Partition Handling
How cluster nodes propagate topology changes via PING/PONG gossip, cluster-node-timeout in failure detection, split-brain handling with cluster-require-full-coverage, and when a Cluster sacrifices availability for consistency.
A-12
Multi-Region Redis: Active-Active and Geo-Replication
Active-active (CRDT-based) vs active-passive multi-region Redis, Redis Enterprise geo-replication, conflict resolution strategies, latency-vs-consistency trade-offs, and when global distribution is the wrong answer.
A-13
Disaster Recovery, Backup, and Point-in-Time Restore
RDB backup scheduling, AOF log shipping to cold storage, BGSAVE + BGREWRITEAOF interaction, DEBUG RELOAD for in-memory consistency checks, and a documented recovery runbook for the three most common failure scenarios.
A-14
Performance Benchmarking and Production Tuning
redis-benchmark usage and interpreting results, OS-level tuning (transparent huge pages, TCP backlog, ulimit), slowlog analysis, latency percentile monitoring, and the 5 configuration changes that meaningfully improve throughput.
A-15
Topology Decision Tree: Standalone, Sentinel, or Cluster
The decision framework: standalone for development or workloads that can tolerate restart downtime, Sentinel for automatic failover when the dataset fits on a single node, Cluster for horizontal scaling past single-node RAM. Operational cost of each and when managed Redis wins.
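
The hash slot model from module A-9 is small enough to sketch directly. The block below is an illustrative reimplementation, not taken from any client library: the function names are hypothetical, and it assumes the CRC16 variant is XMODEM (polynomial 0x1021, initial value 0), which matches the Redis Cluster specification. It maps a key to one of the 16,384 slots and applies the hash tag rule.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0, processed MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_hash_slot(key: bytes) -> int:
    """Map a key to a slot in 0..16383, honouring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end != -1 and end != start + 1:
            key = key[start + 1 : end]
    return crc16_xmodem(key) % 16384


if __name__ == "__main__":
    for k in (b"{user:1}:profile", b"{user:1}:cart", b"user:2:cart"):
        print(k, key_hash_slot(k))
```

Keys that share a hash tag, such as `{user:1}:profile` and `{user:1}:cart`, land on the same slot, which is what allows multi-key commands on them in Cluster mode.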

© 2026 Jatin Jain Saraf (JJS). All rights reserved.