Active-active (CRDT-based) vs active-passive multi-region Redis, Redis Enterprise geo-replication, conflict resolution strategies, latency-vs-consistency trade-offs, and when global distribution is the wrong answer.
A-12 — Multi-Region Redis: Active-Active and Geo-Replication
Who this module is for: You serve users across multiple geographic regions and need Redis to be close to each user — low-latency reads and writes from any region. This module covers the two multi-region Redis architectures (active-passive and active-active), the CRDT-based conflict resolution that enables active-active, and when global Redis distribution creates more problems than it solves.
Why Multi-Region Redis
A Redis instance in us-east-1 adds 120ms+ of round-trip time for a user in ap-southeast-1 — completely negating Redis's sub-millisecond advantage over a database. For globally distributed applications, you need Redis deployed close to each user group.
Two architectural options:
Active-Passive: One region writes, other regions read (with replication lag)
Active-Active: All regions write; changes propagate and conflicts are resolved
Active-Passive: Regional Read Replicas
The simplest multi-region setup: the primary Redis is in one region; replicas in other regions serve reads with some lag.
us-east-1 (Primary): accepts all writes
│
│ async replication (100–200ms cross-region latency)
▼
eu-west-1 (Replica): serves reads (100–200ms behind primary)
ap-southeast-1 (Replica): serves reads (150–250ms behind primary)
Read path: Applications in eu-west-1 read from the local replica with < 1ms latency.
Write path: All writes go to us-east-1. An API call from ap-southeast-1 that writes data adds 150–250ms of round-trip time to reach the primary. Unacceptable for write-heavy workloads.
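Failover: If the primary fails, you manually promote a replica with REPLICAOF NO ONE and update your application's connection strings. Because replication is asynchronous, any writes that had not yet reached the promoted replica are lost. Sentinel can automate promotion within a single region, but it is a poor fit across regions: WAN latency and transient partitions make cross-region failure detection unreliable.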
When active-passive is appropriate:
- Read-heavy workloads where most data is read many times and written rarely
- Data that is written in one primary region (user-generated content flow is unidirectional)
- Cache workloads where cross-region write latency is acceptable
Active-Active: Every Region Writes Locally
In active-active mode, each regional Redis instance accepts writes. Changes propagate to other regions asynchronously. When two regions write to the same key simultaneously, conflict resolution determines the outcome.
us-east-1 (Primary 1): accepts writes from US users
eu-west-1 (Primary 2): accepts writes from EU users
ap-southeast-1 (Primary 3): accepts writes from APAC users
│ │ │
└───────────┴───────────┘
bidirectional replication
(100–300ms cross-region)
Write path: Each region writes locally with < 1ms latency. The write propagates to other regions asynchronously.
Conflict scenario:
T=0ms: US user sets key "product:999:stock" to 5
T=0ms: EU user simultaneously sets key "product:999:stock" to 3
T=200ms: US change arrives in EU, EU change arrives in US
→ What is the final value? 5? 3? Something else?
CRDTs: Conflict-Free Replicated Data Types
CRDTs (Conflict-Free Replicated Data Types) are data structures designed so that concurrent operations from different replicas can be merged without conflicts — regardless of order.
Redis Enterprise Geo-Distribution (and Redis Cloud) implements CRDTs for Redis data types:
| Redis Type | CRDT Semantics |
|---|---|
| String | Last-Write-Wins (LWW) based on logical timestamp |
| Counter (INCR) | All increments are summed across regions |
| Set | All SADDs are merged (union); SREM removes only the adds it has observed |
| Sorted Set | All ZADDs are merged; conflicts resolved by LWW |
| Hash | Field-level LWW — different fields can come from different regions |
| List | Append-only merging; ordering by logical timestamp |
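The Hash row is the easiest one to misread, so here is a minimal sketch of field-level LWW, assuming each field carries the timestamp of its last write. The names `FieldVersion`, `HashState`, `wins`, and `mergeHash`, the sample field values, and the tie-break on value are all hypothetical; this illustrates the idea and is not how Redis Enterprise stores hashes internally.

```typescript
// Illustrative field-level LWW merge for a hash; not Redis Enterprise's implementation.
interface FieldVersion {
  value: string;
  timestamp: number; // timestamp of the write that set this field
}
type HashState = Map<string, FieldVersion>;

// True when `candidate` should replace `current`. Ties fall back to comparing
// values (a hypothetical rule) so that both regions pick the same winner.
function wins(candidate: FieldVersion, current: FieldVersion): boolean {
  if (candidate.timestamp !== current.timestamp) {
    return candidate.timestamp > current.timestamp;
  }
  return candidate.value > current.value;
}

// Merge field by field: each field keeps whichever version was written last.
function mergeHash(a: HashState, b: HashState): HashState {
  const merged = new Map(a);
  b.forEach((theirs, field) => {
    const mine = merged.get(field);
    if (mine === undefined || wins(theirs, mine)) merged.set(field, theirs);
  });
  return merged;
}

// US holds the newest name; EU holds the newest email.
const usHash: HashState = new Map([
  ["name", { value: "Jatin", timestamp: 1005 }],
  ["email", { value: "old@example.com", timestamp: 900 }],
]);
const euHash: HashState = new Map([
  ["name", { value: "OldName", timestamp: 900 }],
  ["email", { value: "new@example.com", timestamp: 1003 }],
]);

const mergedInUs = mergeHash(usHash, euHash);
const mergedInEu = mergeHash(euHash, usHash);
console.log(mergedInUs.get("name")?.value, mergedInUs.get("email")?.value);
console.log(mergedInEu.get("name")?.value, mergedInEu.get("email")?.value);
```

Both regions end up with the name written in the US and the email written in the EU, which is what "different fields can come from different regions" means in practice.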
Last-Write-Wins (LWW)
The write with the highest logical timestamp (Lamport clock or hybrid logical clock) wins. For String types:
US at T=1000: SET product:999:stock "5"
EU at T=1001: SET product:999:stock "3" (slightly later timestamp)
→ EU's write wins: final value = "3"
LWW is simple but has a failure mode: if two writes happen at nearly the same time, their order is decided by how each region happened to timestamp them (clock skew between regions, or a tie-break rule), not by anything the application controls. The losing write is silently discarded.
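The merge rule can be sketched in a few lines. This is an illustration under stated assumptions, not Redis Enterprise's code: `LwwRegister`, `mergeLww`, and the tie-break on region id are hypothetical names and choices made for the example.

```typescript
// Illustrative last-write-wins register; not Redis Enterprise's actual implementation.
interface LwwRegister {
  value: string;
  timestamp: number; // timestamp attached to the write
  regionId: string;  // hypothetical tie-breaker so every replica picks the same winner
}

// Merge two versions of the same key. The result does not depend on argument
// order, so both regions converge no matter which update arrives first.
function mergeLww(a: LwwRegister, b: LwwRegister): LwwRegister {
  if (a.timestamp !== b.timestamp) {
    return a.timestamp > b.timestamp ? a : b;
  }
  // Equal timestamps: fall back to a deterministic comparison of region ids.
  return a.regionId > b.regionId ? a : b;
}

const usWrite: LwwRegister = { value: "5", timestamp: 1000, regionId: "us-east-1" };
const euWrite: LwwRegister = { value: "3", timestamp: 1001, regionId: "eu-west-1" };

// Each region applies the remote write to its own local state.
const finalInUs = mergeLww(usWrite, euWrite);
const finalInEu = mergeLww(euWrite, usWrite);

console.log(finalInUs.value, finalInEu.value); // both regions keep EU's value
```

Note that the US write is not merged or queued; it simply disappears. Whether that is acceptable depends on what the key represents.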
Counter Convergence
For INCR/DECR, CRDTs sum all increments from all regions:
US: INCRBY inventory 100
EU: INCRBY inventory 50
APAC: DECRBY inventory 30
→ Convergent value: 100 + 50 - 30 = 120 (regardless of order or timing)
This is the correct semantics for distributed counters — page views, inventory adjustments, score increments.
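One common way to get this behaviour is a PN-counter, where each region tracks only its own increments and decrements and a merge takes the per-region maximum. The sketch below assumes that design; `Totals`, `CounterState`, `applyDelta`, `mergeCounters`, and `counterValue` are hypothetical names, and the source does not say which counter CRDT Redis Enterprise uses internally.

```typescript
// Illustrative PN-counter: each region tracks its own increments and decrements.
interface Totals {
  inc: number; // sum of increments issued by this region
  dec: number; // sum of decrements issued by this region
}
type CounterState = Map<string, Totals>;

// Apply a local INCRBY (positive delta) or DECRBY (negative delta) in one region.
function applyDelta(state: CounterState, region: string, delta: number): CounterState {
  const next = new Map(state);
  const current = next.get(region) ?? { inc: 0, dec: 0 };
  next.set(
    region,
    delta >= 0
      ? { inc: current.inc + delta, dec: current.dec }
      : { inc: current.inc, dec: current.dec - delta },
  );
  return next;
}

// Merge takes the per-region maximum, so receiving the same update twice is harmless.
function mergeCounters(a: CounterState, b: CounterState): CounterState {
  const merged = new Map(a);
  b.forEach((theirs, region) => {
    const mine = merged.get(region) ?? { inc: 0, dec: 0 };
    merged.set(region, {
      inc: Math.max(mine.inc, theirs.inc),
      dec: Math.max(mine.dec, theirs.dec),
    });
  });
  return merged;
}

function counterValue(state: CounterState): number {
  let total = 0;
  state.forEach((t) => {
    total += t.inc - t.dec;
  });
  return total;
}

const usCounter = applyDelta(new Map<string, Totals>(), "us-east-1", 100);
const euCounter = applyDelta(new Map<string, Totals>(), "eu-west-1", 50);
const apacCounter = applyDelta(new Map<string, Totals>(), "ap-southeast-1", -30);

// Merge in two different orders; both give the same total.
const orderA = mergeCounters(mergeCounters(usCounter, euCounter), apacCounter);
const orderB = mergeCounters(apacCounter, mergeCounters(euCounter, usCounter));
console.log(counterValue(orderA), counterValue(orderB)); // 120 120
```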
Set Merge Semantics
For Sets, the CRDT merges all adds and respects the "observed-remove" rule — a delete only removes adds that the deleting replica has observed:
US: SADD tags "redis" → {redis}
EU: SADD tags "postgres" → {postgres}
(changes propagate)
Final: {redis, postgres} ← union
US: SREM tags "redis" → removes the "redis" add that US observed
EU: SADD tags "redis" → adds "redis" again simultaneously
(changes propagate)
Final: {redis, postgres} ← EU's add survives because US's SREM covers only the add that US had observed
This can produce unintuitive results (a concurrent add brings a removed element back), but it ensures convergence.
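The add/remove sequence above can be reproduced with a small observed-remove set. This is a sketch under assumptions: the `OrSet` class, its unique add-tags, and `mergeFrom` are hypothetical and model the textbook OR-Set, not Redis Enterprise's internal representation.

```typescript
// Illustrative observed-remove set (OR-Set); not Redis Enterprise's internal representation.
class OrSet {
  private adds = new Map<string, Set<string>>(); // element -> unique tags of its adds
  private removedTags = new Set<string>();       // add-tags that some replica has removed
  private nextTag = 0;
  private replicaId: string;

  constructor(replicaId: string) {
    this.replicaId = replicaId;
  }

  add(element: string): void {
    const tag = `${this.replicaId}:${this.nextTag++}`;
    const tags = this.adds.get(element) ?? new Set<string>();
    tags.add(tag);
    this.adds.set(element, tags);
  }

  // Remove only the add-tags this replica has observed so far.
  remove(element: string): void {
    const tags = this.adds.get(element);
    if (tags) tags.forEach((tag) => this.removedTags.add(tag));
  }

  // Pull in everything the other replica knows (adds and removals).
  mergeFrom(other: OrSet): void {
    other.adds.forEach((tags, element) => {
      const mine = this.adds.get(element) ?? new Set<string>();
      tags.forEach((tag) => mine.add(tag));
      this.adds.set(element, mine);
    });
    other.removedTags.forEach((tag) => this.removedTags.add(tag));
  }

  // An element is present while at least one of its add-tags is not removed.
  members(): string[] {
    const result: string[] = [];
    this.adds.forEach((tags, element) => {
      let alive = false;
      tags.forEach((tag) => {
        if (!this.removedTags.has(tag)) alive = true;
      });
      if (alive) result.push(element);
    });
    return result.sort();
  }
}

const usSet = new OrSet("us");
const euSet = new OrSet("eu");
usSet.add("redis");
euSet.add("postgres");
usSet.mergeFrom(euSet);
euSet.mergeFrom(usSet); // both regions now hold {postgres, redis}

usSet.remove("redis"); // covers only the "redis" add that US has seen
euSet.add("redis");    // concurrent add with a fresh tag
usSet.mergeFrom(euSet);
euSet.mergeFrom(usSet);

console.log(usSet.members(), euSet.members()); // "redis" is present in both
```

The removal is recorded against specific add-tags rather than against the element name, which is why EU's concurrent add is untouched by it.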
Redis Enterprise vs Redis OSS for Active-Active
Open-source Redis does not natively support active-active geo-replication. The replication system is designed for primary-replica, not peer-to-peer.
Redis Enterprise (commercial product from Redis Ltd.) provides:
- Active-Active geo-distribution with CRDT semantics
- Automatic conflict resolution
- Global keyspace — all regions share the same logical database
- WAN-optimised replication (delta compression, bandwidth throttling)
Redis Cloud (managed Redis Enterprise) is the SaaS offering.
Alternatives for OSS Redis:
- Roshi (SoundCloud's approach): application-layer CRDT (a last-write-wins element set) using Sorted Sets with timestamp scores; reads perform a merge of multiple region results
- Application-level coordination: accept that writes go to one authoritative region and reads may be stale in other regions
- Consistent hashing + per-region primary: different keys are "owned" by different regions; cross-region reads accept the latency
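For the per-region ownership option, every application server must agree on which region owns a key without coordinating. A minimal sketch follows, using rendezvous (highest-random-weight) hashing as a simple stand-in for a consistent-hash ring. The `fnv1a` helper (an FNV-1a-style hash over UTF-16 code units), the `ownerRegion` function, and the key format are assumptions for illustration.

```typescript
const REGIONS = ["us-east-1", "eu-west-1", "ap-southeast-1"];

// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Rendezvous hashing: score every (region, key) pair and pick the highest score.
// Removing a region only reassigns the keys that region owned.
function ownerRegion(key: string, regions: string[]): string {
  if (regions.length === 0) throw new Error("at least one region is required");
  let best = "";
  let bestScore = -1;
  for (const region of regions) {
    const score = fnv1a(`${region}|${key}`);
    if (score > bestScore) {
      best = region;
      bestScore = score;
    }
  }
  return best;
}

// Writes for this key go to the owning region's primary; other regions
// either read a local replica or pay the cross-region round trip.
const owner = ownerRegion("user:1001:profile", REGIONS);
console.log(`user:1001:profile is owned by ${owner}`);
```

In practice, ownership is often chosen by data locality (for example, the user's home region) rather than by hash. Hashing is mainly useful when keys have no natural home.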
The Consistency Trade-off
Active-active replication is eventually consistent. Between when a write is applied in one region and when it propagates to all others (100–300ms for cross-region), different users see different values:
US user writes: SET username:1001 "Jatin"
150ms later, EU user reads: GET username:1001
→ EU replica: "OldName" (replication hasn't arrived yet)
For some use cases this is acceptable:
- Product recommendations (briefly stale is fine)
- Feature flags (a user in EU and a user in US seeing different states for 200ms is acceptable)
- Leaderboards (eventual consistency is expected)
For some it is not:
- Account balances (EU and US must agree on the balance)
- Inventory counts (two regions cannot both sell the last item)
- Sessions (a session created in the US must be immediately valid in EU)
For strong consistency across regions: use a database with cross-region transactions (CockroachDB, Spanner, YugabyteDB), not Redis.
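The inventory case is worth seeing concretely, because counter convergence does not protect an invariant such as "stock never goes below zero". The simulation below is a sketch with hypothetical names (`RegionState`, `trySell`, `replicate`); it models two regions that each check their local copy before decrementing.

```typescript
// Two regions each hold a local copy of the stock counter (illustrative simulation).
interface RegionState {
  localStock: number;
  pendingDeltas: number[]; // changes not yet replicated to the other region
}

function trySell(region: RegionState): boolean {
  if (region.localStock < 1) return false; // check against the local, possibly stale view
  region.localStock -= 1;
  region.pendingDeltas.push(-1);
  return true;
}

// Asynchronous replication: ship the pending deltas to the other region.
function replicate(from: RegionState, to: RegionState): void {
  for (const delta of from.pendingDeltas) {
    to.localStock += delta;
  }
  from.pendingDeltas = [];
}

const usStock: RegionState = { localStock: 1, pendingDeltas: [] };
const euStock: RegionState = { localStock: 1, pendingDeltas: [] };

// Both regions sell the last item before either change has propagated.
const soldInUs = trySell(usStock);
const soldInEu = trySell(euStock);

replicate(usStock, euStock);
replicate(euStock, usStock);

console.log(soldInUs, soldInEu, usStock.localStock, euStock.localStock); // true true -1 -1
```

Both regions agree on the final value, so the system has converged, yet one item was sold twice. Convergence and correctness are different properties.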
Practical Patterns for Multi-Region Redis Without Active-Active
Regional Primary with Global Read Replicas
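```typescript
import Redis from "ioredis";

// Each region has its own Redis primary for local writes,
// plus read replicas of all other regions' primaries.
// Hostnames are placeholders.
const usRedis = new Redis({ host: "redis-us-primary.example.internal" });
const euRedis = new Redis({ host: "redis-eu-replica-of-us.example.internal" });

// When a US user writes:
async function saveProfile(id: string, data: string): Promise<void> {
  await usRedis.set(`user:${id}:profile`, data);
}

// When a user reads in EU (accept small lag):
async function loadProfileInEU(id: string): Promise<string | null> {
  const profile = await euRedis.get(`user:${id}:profile`);
  if (profile !== null) return profile;
  // Fall back to the US primary if the EU replica hasn't received the write.
  // This covers missing keys only; a stale value is still returned as-is.
  return usRedis.get(`user:${id}:profile`);
}
```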
Sticky Sessions by Region
Route each user's requests to their "home region" for consistent reads and writes. Use geolocation at the CDN/load balancer layer:
US users → us-east-1 Redis (reads and writes)
EU users → eu-west-1 Redis (reads and writes)
# Cross-region migration: replicate with lag; brief inconsistency on region change
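The routing decision itself can be very small. The sketch below assumes the CDN or load balancer supplies a continent code and that a user's home region may be stored with their account. The `Region` type, the `CONTINENT_TO_REGION` table, and `homeRegion` are hypothetical names for illustration.

```typescript
type Region = "us-east-1" | "eu-west-1" | "ap-southeast-1";

// Hypothetical mapping from continent codes to the nearest Redis region.
const CONTINENT_TO_REGION: Record<string, Region> = {
  NA: "us-east-1",
  SA: "us-east-1",
  EU: "eu-west-1",
  AF: "eu-west-1",
  AS: "ap-southeast-1",
  OC: "ap-southeast-1",
};

const DEFAULT_REGION: Region = "us-east-1";

// A stored home region takes priority, so a travelling user keeps hitting
// the same Redis instead of silently switching regions mid-session.
function homeRegion(continentCode: string, storedHomeRegion?: Region): Region {
  if (storedHomeRegion) return storedHomeRegion;
  return CONTINENT_TO_REGION[continentCode] ?? DEFAULT_REGION;
}

console.log(homeRegion("EU"));              // eu-west-1
console.log(homeRegion("AS", "us-east-1")); // us-east-1 (stored home region wins)
```

Pinning by stored home region rather than by current location is a design choice: it trades some latency for travelling users against the inconsistency window that a region change would otherwise open.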
Read-Your-Own-Writes via Sticky Routing
After a write, route subsequent reads to the same region (where the write is definitely present) for the next few seconds:
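```typescript
import Redis from "ioredis";

// Hostnames are placeholders. Replicas are read-only, so the short-lived
// write tags go to a writable Redis that is local to this app server.
const primaryRedis = new Redis({ host: "redis-primary.example.internal" });
const localReplicaRedis = new Redis({ host: "redis-replica.example.internal" });
const localTagRedis = new Redis({ host: "redis-local.example.internal" });

async function write(key: string, value: string, userId: string): Promise<void> {
  await primaryRedis.set(key, value);
  // Tag this user as "just wrote" for 5 seconds
  await localTagRedis.set(`write-tag:${userId}`, "1", "EX", 5);
}

async function read(key: string, userId: string): Promise<string | null> {
  const justWrote = await localTagRedis.exists(`write-tag:${userId}`);
  if (justWrote) {
    // Route to primary to read our own write
    return primaryRedis.get(key);
  }
  // Route to local replica (possibly stale, but user hasn't written recently)
  return localReplicaRedis.get(key);
}
```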
Summary
- Active-passive: one primary region, read replicas in other regions — low write latency only in the primary region
- Active-active: every region writes locally — requires CRDT conflict resolution; supported by Redis Enterprise/Cloud, not OSS Redis
- CRDTs resolve conflicts via: LWW for Strings, sum-all-increments for counters, merge for Sets, field-level LWW for Hashes
- Active-active is eventually consistent — writes propagate with 100–300ms cross-region lag
- Do not use Redis active-active for strong consistency requirements (account balances, inventory) — use a transactional database
- OSS Redis alternatives: regional primaries + cross-region read replicas, sticky user routing, application-layer CRDT patterns
- The question "should I use multi-region Redis?" often has the answer "no" — regional cache with fallback is simpler and correct for most use cases
Next: A-13 — Disaster Recovery, Backup, and Point-in-Time Restore — RDB backup scheduling, AOF log shipping, recovery runbooks, and testing restore procedures before you need them.