Module A-15 · 18 min read

The decision framework: standalone for development and tolerable restart downtime, Sentinel for automatic failover with a single-node dataset, Cluster for horizontal scaling past single-node RAM. Operational cost of each and when managed Redis wins.

A-15 — Topology Decision Tree: Standalone, Sentinel, or Cluster

This is the final module of the Architect tier. Every module in this course has built toward this decision. You now have the vocabulary, the mental models, and the production experience to answer the question every Redis architect eventually faces: what topology should I deploy? This module synthesises the course into a practical decision framework — not a flowchart, but a reasoned guide based on your actual requirements.


The Three Topologies

Standalone Redis

A single Redis primary with no high-availability configuration. The simplest deployment.

App → Redis Primary

What you get:

  • Minimal operational complexity
  • Full command support (no Cluster restrictions)
  • Up to ~1M ops/sec on modern hardware (with pipelining)
  • Dataset limited to one machine's RAM

What you don't get:

  • Automatic failover (Redis down = application down until manual recovery)
  • Data distribution beyond one node

Redis Sentinel

A primary with one or more replicas, monitored by 3+ Sentinel processes. Automatic failover on primary failure.

App → (Sentinel-discovered) Redis Primary ← Replicas
         ↑
    Sentinel × 3

What you get:

  • Automatic failover (10–30 seconds of write downtime on primary failure)
  • Read scaling via replicas
  • Full command support (no Cluster restrictions on multi-key ops)
  • Dataset limited to one primary's RAM

What you don't get:

  • Horizontal write scaling
  • Dataset scaling beyond one machine's RAM

Redis Cluster

Data sharded across multiple primary nodes, each with replicas. Built-in failover.

App → Cluster-aware client
         ↓
  ┌─────────┬─────────┬─────────┐
  │ Node 1  │ Node 2  │ Node 3  │  (primaries)
  │ slots   │ slots   │ slots   │
  │ 0-5460  │ 5461-   │ 10923-  │
  │         │ 10922   │ 16383   │
  └─────────┴─────────┴─────────┘
  Replicas for each primary
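
The slot ranges in the diagram come from dividing the 16384 hash slots evenly across the primaries. A minimal sketch of that arithmetic follows. The function name `split_slots` and the rounding scheme are assumptions chosen to reproduce the three-node ranges shown above, not a description of how any particular tool assigns slots.

```python
TOTAL_SLOTS = 16384  # slots are numbered 0..16383


def split_slots(primaries: int) -> list:
    """Return contiguous (first, last) slot ranges, one per primary."""
    if primaries < 1:
        raise ValueError("need at least one primary")
    per_node = TOTAL_SLOTS / primaries  # fractional share per primary
    ranges = []
    first, cursor = 0, 0.0
    for i in range(primaries):
        last = round(cursor + per_node - 1)
        if i == primaries - 1:
            # The final primary absorbs any rounding remainder.
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_node
    return ranges


if __name__ == "__main__":
    for n, (lo, hi) in enumerate(split_slots(3), start=1):
        print(f"Node {n}: slots {lo}-{hi}")
```

Adding a fourth primary changes every range, which is why growing a cluster involves migrating slots rather than only starting a new node.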

What you get:

  • Horizontal write scaling (add nodes to add capacity)
  • Dataset scaling beyond single-node RAM
  • Built-in automatic failover per shard

What you don't get:

  • Unrestricted multi-key operations (keys on different slots require hash tags)
  • SELECT for multiple databases (database 0 only)
  • Simplicity — significantly more operational complexity

The Decision Framework

Work through these questions in order. Stop when you find your answer.

Question 1: Is Redis purely a cache where data loss is acceptable?

Yes → Use Standalone with no persistence.

The simplest deployment. If Redis goes down, your application falls back to the database, the cache re-populates on the next request, and you move on. The cost of Redis downtime is slower application response, not data loss.

This covers the majority of Redis use cases. Most Redis deployments are caches.

No → continue to Question 2.


Question 2: Does your dataset fit in one machine's RAM?

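Example: 64GB of application data means a 64GB primary and a 64GB replica, each on its own machine. In the worst case (a write-heavy workload that touches most pages during the snapshot), copy-on-write can push memory use towards 2× the dataset, so ≥ 128GB RAM per machine is the conservative figure for safe BGSAVE operation.

A rough formula for typical workloads: machine RAM ≥ 1.5× dataset size (to handle BGSAVE CoW overhead).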
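
The sizing rule can be written as a small helper. This is a sketch of the arithmetic only. The names `min_machine_ram_gb`, `fits_on_machine`, and `cow_headroom` are hypothetical, and the right headroom factor depends on how write-heavy your workload is during a snapshot.

```python
def min_machine_ram_gb(dataset_gb: float, cow_headroom: float = 1.5) -> float:
    """Minimum machine RAM for a dataset, leaving room for BGSAVE copy-on-write.

    cow_headroom is an assumed multiplier: 1.5 is the rough rule of thumb,
    2.0 is the conservative choice for write-heavy workloads.
    """
    if dataset_gb <= 0:
        raise ValueError("dataset_gb must be positive")
    if cow_headroom < 1.0:
        raise ValueError("cow_headroom must be at least 1.0")
    return dataset_gb * cow_headroom


def fits_on_machine(dataset_gb: float, machine_ram_gb: float,
                    cow_headroom: float = 1.5) -> bool:
    """True if the dataset fits on one machine with the chosen headroom."""
    return machine_ram_gb >= min_machine_ram_gb(dataset_gb, cow_headroom)
```

For the 64GB example, `min_machine_ram_gb(64)` gives 96.0 and `min_machine_ram_gb(64, cow_headroom=2.0)` gives 128.0.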

Yes, it fits → continue to Question 3.
No, it does not fit → strong signal for Cluster (skip to Question 6).


Question 3: Can you tolerate manual failover?

Manual failover means: primary crashes → you get paged → you promote a replica with REPLICAOF NO ONE → you update application config → application reconnects. Typical duration: 5–30 minutes depending on on-call response time.
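
The promotion step can be scripted so that whoever is on call does not type it from memory. The sketch below assumes `replica` is a redis-py style client that exposes `execute_command()` and `info()`. The function name `promote_replica` and the timeout values are hypothetical. Updating application config and reconnecting clients are still separate steps.

```python
import time


def promote_replica(replica, timeout_s: float = 5.0, poll_s: float = 0.1) -> None:
    """Promote a replica to primary and wait until it reports the new role.

    `replica` is assumed to be a redis-py style client connected to the
    replica that should become the new primary.
    """
    # Stop replicating; the node keeps its data and starts accepting writes.
    replica.execute_command("REPLICAOF", "NO", "ONE")

    # Poll INFO replication until the node reports itself as a primary.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if replica.info("replication").get("role") == "master":
            return
        time.sleep(poll_s)
    raise TimeoutError("replica did not report role=master in time")
```

Rehearsing this in staging is what keeps the real event at the low end of the 5–30 minute range.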

Yes, manual failover is acceptable (development, staging, non-critical production) → Standalone with a manually-managed replica for backups.

No, you need automatic failover → continue to Question 4.


Question 4: Do you need unrestricted multi-key operations?

If your application relies on MULTI/EXEC across arbitrary keys, Lua scripts accessing keys from different hash slots, or complex MGET/MSET patterns — Cluster's hash-slot restriction is a significant constraint.

Yes, you need unrestricted multi-key ops → Sentinel.

Sentinel gives you automatic failover with full command support and no Cluster restrictions.

No, you can work with hash tags and single-slot constraints → Sentinel is still simpler unless you also need horizontal scaling.


Question 5: Do you need horizontal write scaling?

Can your write workload be handled by a single Redis node (up to ~500K writes/second)?

Yes, single node is sufficient → Sentinel.

Sentinel handles automatic failover, read scaling via replicas, and full command support. It is simpler to operate than Cluster and appropriate for the vast majority of production Redis deployments.

No, you need to distribute writes across multiple nodes → Cluster.


Question 6 (Cluster path): Can your application keys be co-located with hash tags?

This question applies if you need horizontal scaling (of writes or of dataset size) and are considering Cluster:

Can all related keys be accessed in the same hash slot via {tag} naming?

If your application uses Lua scripts or transactions that span multiple unrelated key namespaces, Cluster's cross-slot restriction may require significant refactoring.

Yes, you can adopt hash tags → Cluster. Proceed with hash tag design before implementing.
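
To check a hash tag design before deploying, you can compute slots offline. The sketch below follows the Redis Cluster rule: slot = CRC16(key) mod 16384, where only the text between the first `{` and the next `}` is hashed when that text is non-empty. It uses Python's `binascii.crc_hqx` with an initial value of 0 as the CRC16 implementation. The helper names `hash_tag_part` and `key_slot` are hypothetical.

```python
import binascii

TOTAL_SLOTS = 16384


def hash_tag_part(key: bytes) -> bytes:
    """Return the part of the key that is hashed, honouring {tag} syntax."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag counts; "{}" means the whole key is hashed.
        if end != -1 and end > start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """Compute the Cluster hash slot for a key."""
    data = hash_tag_part(key.encode("utf-8"))
    return binascii.crc_hqx(data, 0) % TOTAL_SLOTS


def same_slot(*keys: str) -> bool:
    """True if all keys land in one slot, so multi-key commands are allowed."""
    return len({key_slot(k) for k in keys}) <= 1
```

For example, `same_slot("{user:42}:profile", "{user:42}:cart")` is true because both keys hash only `user:42`. Running `same_slot` over the key sets used by each Lua script or transaction is a quick way to find the places that would need refactoring.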

No, cross-slot operations are fundamental → Consider:

  • Refactoring to use Cluster-compatible patterns (worth it if you genuinely need horizontal scale)
  • Managed Redis with larger node sizes (delay Cluster adoption)
  • A different distributed cache architecture

Decision Summary Table

| Requirement | Topology | Notes |
|---|---|---|
| Pure cache, data loss OK | Standalone | No persistence needed |
| Auth data, tolerate manual failover | Standalone + replica | Manual promotion procedure |
| Auto-failover, fits in single node RAM | Sentinel | Most production Redis deployments |
| Auto-failover, read scaling | Sentinel + replicas | Standard production setup |
| Dataset > single node RAM | Cluster | Hash tag design required |
| Write throughput > 500K/sec | Cluster | Rare — most need Sentinel |
| Multi-region, low write latency | Redis Enterprise Active-Active | Not OSS Redis |
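
The six questions can also be encoded as a function, which is useful for documenting why a given service ended up on a given topology. This is a sketch only. The `Requirements` field names and the returned strings are hypothetical, and it treats "dataset does not fit on one node" as going straight to the Cluster path.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    pure_cache_loss_ok: bool           # Question 1
    fits_single_node_ram: bool         # Question 2
    manual_failover_ok: bool           # Question 3
    needs_unrestricted_multikey: bool  # Question 4
    needs_write_scaling: bool          # Question 5
    can_use_hash_tags: bool            # Question 6


def _cluster_path(r: Requirements) -> str:
    """Question 6: Cluster only works well if keys can be co-located."""
    if r.can_use_hash_tags:
        return "Cluster"
    return "Reconsider: refactor for Cluster, larger managed nodes, or another architecture"


def choose_topology(r: Requirements) -> str:
    """Walk the questions in order and stop at the first answer."""
    if r.pure_cache_loss_ok:
        return "Standalone (no persistence)"
    if not r.fits_single_node_ram:
        return _cluster_path(r)
    if r.manual_failover_ok:
        return "Standalone + manually managed replica"
    if r.needs_unrestricted_multikey:
        return "Sentinel"
    if r.needs_write_scaling:
        return _cluster_path(r)
    return "Sentinel"
```

A typical production service (not a pure cache, dataset fits on one node, needs automatic failover, modest write rate) comes out as "Sentinel", matching the table.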

The Case for Managed Redis

Before deciding between Sentinel and Cluster, evaluate whether a managed Redis service simplifies your decision:

AWS ElastiCache:

  • Cluster mode disabled: Sentinel-equivalent — multi-AZ failover, read replicas, automated failover in < 60 seconds
  • Cluster mode enabled: Redis Cluster — add shards for horizontal scaling
  • Handles all infrastructure, patching, and failover orchestration

Google Cloud Memorystore:

  • Standard tier: primary + replica, automated failover
  • Cluster: full Redis Cluster support (Preview as of 2024)

Redis Cloud (Redis Ltd.):

  • Full Redis Enterprise capabilities — active-active geo-distribution, modules, higher availability SLAs
  • Premium pricing but managed operations

Recommendation: For most production deployments, use a managed service. The operational cost of running Sentinel or Cluster yourself (monitoring, patching, failover testing, backup management) is significant. Managed services commoditise this work.


The Operational Cost of Cluster

Before choosing Cluster, be honest about the operational overhead:

  • Monitoring: 3+ primaries + 3+ replicas = 6+ nodes to monitor, alert on, and maintain
  • Resharding: adding nodes requires careful slot migration planning
  • Upgrades: rolling upgrades across a cluster require a defined procedure
  • Debugging: cross-slot errors, gossip convergence delays, partial cluster failures
  • Development parity: developers need a local Cluster setup or a proxy that simulates Cluster behaviour

For teams without dedicated Redis operational expertise, the jump from Sentinel to Cluster often results in prolonged incidents that a managed service would have handled automatically.

The question is not "can we run Cluster?" but "should we run Cluster, or should we use a managed service that runs it for us?"


Putting It All Together

Here is the honest synthesis of this course:

Redis is an extraordinarily capable tool. The primitives — atomic counters, pub/sub, streams, Lua scripts, sorted sets — are elegant solutions to problems that would require significant infrastructure elsewhere.

Redis is also easy to misuse. The most common mistakes are not in the code — they are architectural: using it as a primary database without understanding durability trade-offs, ignoring eviction under memory pressure, not planning for the coordination state it holds, and treating "it's on Redis" as synonymous with "it's reliable."

The engineers who use Redis most effectively understand two things:

  1. Redis's mental model — that every operation is a data structure manipulation, that memory is the primary resource, that the single-threaded event loop is both its strength and its constraint.

  2. The distributed systems context — that Redis is one component in a larger system, that its persistence guarantees are weaker than a database, that replication lag is real, and that locks are best-effort coordination, not guarantees.

You now have both. Every module in this course has added a layer to this understanding. Use it well.


Course Complete

Foundation tier (F-1 through F-11): Data structures, memory model, expiry, eviction, HyperLogLog, Bitmaps, Geo, pipelining, Pub/Sub, Streams, memory internals, transactions, caching patterns.

Practitioner tier (P-1 through P-13): RDB persistence, AOF persistence, persistence decision framework, memory profiling, atomic operations, BullMQ internals, cache failure modes, keyspace notifications, session management, connection pooling, monitoring, security, the event loop.

Architect tier (A-1 through A-15): Distributed locking, Lua scripting, Redis Functions, Redlock, advanced lock patterns, the SupraScan production case study, replication, Sentinel, Cluster, resharding, gossip, multi-region, disaster recovery, performance tuning, topology decisions.

What comes next for you:

  • Apply these concepts to a real system under load — theory becomes intuition through production experience
  • Contribute to open-source Redis tooling (the ecosystem of client libraries, monitoring exporters, and operational tools)
  • Read the Redis source code — it is well-written C and the best documentation of the internals
  • Follow antirez (Redis creator) and the Redis maintainers' writeups — the design decisions behind each feature are illuminating

Redis rewards deep understanding. You have the foundation. Now build something with it.

© 2026 Jatin Jain Saraf (JJS). All rights reserved.