Module A-6·24 min read

The real-world problem: 10+ multithreaded Node.js instances processing Kafka-delivered blockchain blocks at 2,000+ TPS without double-processing. Block-range partitioning via Redis locks, heartbeat extension, crash recovery, and the 6-hour replication lag incident.

A-6 — The SupraScan Architecture: Coordinating 10+ Concurrent Scanner Instances

This module is different. The previous modules covered Redis primitives in theory and controlled examples. This module is a case study from production: building a distributed blockchain indexer that processes Kafka-delivered blocks at 2,000+ transactions per second across 10+ concurrent Node.js instances. Every technique from this course is here — distributed locks, Redis coordination, crash recovery, memory management under load, and the failure modes that only appear at production scale.


The Problem

SupraScan is a blockchain indexer for the Supra L1 network. Its job: consume every block from Kafka, parse every transaction, and write structured data to PostgreSQL — with no missed blocks, no duplicate processing, and no gaps in the indexed history.

The constraints:

  • Throughput: 2,000+ transactions per second at peak
  • Concurrency: 10+ worker instances, each multi-threaded (Node.js cluster + worker threads)
  • Ordering: Blocks must be indexed in strict sequence (block N before block N+1)
  • Durability: The indexed data is the only source of historical state — unavailable via on-chain RPC. No data loss is acceptable.
  • Availability: Processing must resume automatically after any worker crash
  • Historical data: The indexer's complete historical dataset became foundational infrastructure consumed by multiple cross-functional teams across analytics, product, and research — this elevated the cost of any data loss or gap

The fundamental challenge: distributed coordination without a central coordinator. Every worker is equal. Any worker can crash at any time. No worker has special authority.


The Architecture

Kafka (block stream)
        │
        │ Multiple consumer groups
        ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐
│  Worker 1    │  │  Worker 2    │  │  Worker N    │
│  (Node.js)   │  │  (Node.js)   │  │  (Node.js)   │
│  Cluster     │  │  Cluster     │  │  Cluster     │
└──────┬───────┘  └──────┬───────┘  └──────┬───────┘
       │                 │                 │
       └────────────────┬┘                 │
                        │                 │
                   ┌────▼──────────────────▼───┐
                   │         Redis              │
                   │  (coordination layer)      │
                   └────────────────────────────┘
                                │
                           ┌────▼────┐
                           │ PostgreSQL│
                           └─────────┘

Redis is the coordination layer — it holds lock state, progress tracking, and health signals. PostgreSQL is the data store. Kafka is the event source.


Block-Range Partitioning via Redis Locks

Each Kafka partition delivers blocks in order. Multiple workers consume from multiple partitions simultaneously. The challenge: if two workers consume the same block (e.g., due to consumer group rebalancing), they must not both write it to the database.

Solution: Block-range locks.

Before processing a block range (e.g., blocks 100,000–100,099), a worker acquires a Redis lock for that range:

SET lock:block:range:100000 {worker_uuid} NX PX 60000

If the lock is acquired, the worker owns that range exclusively. If another worker also tries to process the same range (due to consumer group rebalancing, duplicate delivery), it sees the lock and skips.

typescript
async function processBlockRange(startBlock: number, endBlock: number, blocks: Block[]) { const rangeKey = `lock:block:range:${startBlock}`; const workerId = `${os.hostname()}:${process.pid}:${Date.now()}`; const acquired = await redis.set(rangeKey, workerId, 'NX', 'PX', 60000); if (acquired !== 'OK') { logger.info(`Block range ${startBlock}-${endBlock} already locked, skipping`); return; } // Start heartbeat to extend lock while processing const heartbeat = setInterval(async () => { await redis.pexpire(rangeKey, 60000); // extend by 60 more seconds }, 20000); // renew every 20s (1/3 of TTL) try { for (const block of blocks) { await indexBlock(block); } // Mark range as complete await redis.set(`complete:block:range:${startBlock}`, '1', 'EX', 86400); } finally { clearInterval(heartbeat); // Release lock only if we still hold it const current = await redis.get(rangeKey); if (current === workerId) { await redis.del(rangeKey); } } }

Idempotency Check

Before acquiring the lock, check if this range was already completed (by a previous run or another worker that finished and released the lock):

typescript
const alreadyDone = await redis.exists(`complete:block:range:${startBlock}`); if (alreadyDone) { logger.debug(`Block range ${startBlock} already indexed, skipping`); return; }

The complete:* keys have a 24-hour TTL — sufficient to prevent reprocessing during normal operation while not accumulating indefinitely.


Progress Tracking: Sorted Set as a Processing Frontier

The indexer must maintain a global "frontier" — the highest contiguous block number that has been fully indexed. Blocks may be processed out of order (due to parallel consumers), so we need to track which blocks are complete and advance the frontier only when gaps are filled.

typescript
const COMPLETED_BLOCKS_KEY = 'indexer:completed:blocks'; const FRONTIER_KEY = 'indexer:frontier'; // After successfully indexing a block: async function markBlockComplete(blockNumber: number) { // Add to the sorted set (score = blockNumber) await redis.zadd(COMPLETED_BLOCKS_KEY, blockNumber, String(blockNumber)); // Try to advance the frontier await advanceFrontier(); } async function advanceFrontier() { const frontier = parseInt(await redis.get(FRONTIER_KEY) ?? '0', 10); // Get all completed blocks from frontier+1 onwards const nextBlocks = await redis.zrangebyscore( COMPLETED_BLOCKS_KEY, frontier + 1, frontier + 10000, // look ahead 10,000 blocks 'LIMIT', 0, 10000 ); let newFrontier = frontier; for (const blockStr of nextBlocks) { const block = parseInt(blockStr, 10); if (block === newFrontier + 1) { newFrontier = block; } else { break; // gap in sequence — stop advancing } } if (newFrontier > frontier) { await redis.set(FRONTIER_KEY, String(newFrontier)); // Clean up completed blocks below the new frontier (memory management) await redis.zremrangebyscore(COMPLETED_BLOCKS_KEY, 0, newFrontier); logger.info(`Frontier advanced to block ${newFrontier}`); } }

The Sorted Set holds in-flight and recently completed block numbers. The frontier advances when a contiguous run of completed blocks is detected. Completed blocks below the frontier are trimmed to prevent unbounded memory growth.


Crash Recovery: Reclaiming Orphaned Locks

When a worker crashes mid-processing, its lock expires (due to TTL). The block range is not marked complete. Another worker must pick it up.

The recovery mechanism: a background "reclaimer" process periodically scans for block ranges that have locks but are not in the completed set and not currently being processed by a healthy worker.

typescript
async function reclaimOrphanedRanges() { // Scan all block range locks const lockKeys: string[] = []; let cursor = '0'; do { const [nextCursor, keys] = await redis.scan(cursor, 'MATCH', 'lock:block:range:*', 'COUNT', 100); lockKeys.push(...keys); cursor = nextCursor; } while (cursor !== '0'); for (const lockKey of lockKeys) { const lockValue = await redis.get(lockKey); if (!lockValue) continue; // already expired const ttl = await redis.pttl(lockKey); if (ttl > 30000) continue; // lock is healthy (more than 30s remaining) // Lock is close to expiry — is the holder still alive? const [, workerId] = lockKey.split('lock:block:range:'); const heartbeatKey = `heartbeat:${lockValue}`; const heartbeatExists = await redis.exists(heartbeatKey); if (!heartbeatExists) { // Worker is dead — reclaim this range for reprocessing logger.warn(`Reclaiming orphaned lock ${lockKey} (worker ${lockValue} is down)`); await redis.del(lockKey); // The range will be re-acquired by the next worker that processes it } } }

Worker Heartbeats

Each worker publishes a heartbeat key with a short TTL. If the worker crashes, the heartbeat expires and the reclaimer can detect orphaned ranges:

typescript
// Each worker: publish heartbeat every 10 seconds const HEARTBEAT_TTL = 30; // 30 seconds const heartbeatKey = `heartbeat:${workerId}`; setInterval(async () => { await redis.set(heartbeatKey, Date.now().toString(), 'EX', HEARTBEAT_TTL); }, 10000); // On clean shutdown: remove heartbeat immediately process.on('SIGTERM', async () => { await redis.del(heartbeatKey); process.exit(0); });

The 6-Hour Replication Lag Incident

The most instructive production failure: a Redis replica fell 6 hours behind the primary due to network partition. After the partition healed and the replica caught up, approximately 6 hours of completed-block markers were replayed — causing the reclaimer to see blocks as "not completed" and requeue them for reprocessing.

Root cause: The reclaimer queried a replica (for read scaling), which had stale data. It saw block 500,000 as incomplete (because the completion marker had not yet replicated), even though it was fully indexed in PostgreSQL.

Consequences: Approximately 1M blocks were re-queued for reprocessing. The workers detected duplicate inserts via PostgreSQL's ON CONFLICT DO NOTHING and discarded them, but the extra processing load caused a 2-hour throughput degradation.

Fix:

  1. Read coordination state from the primary only. Completion markers, frontier state, and lock state are read from the primary Redis connection. Replica reads are only used for non-critical data (metrics, display data).

  2. Idempotent writes to PostgreSQL. All INSERT statements use ON CONFLICT DO NOTHING or ON CONFLICT DO UPDATE. Double processing produces the same result as single processing.

  3. Replication lag monitoring. Alert when replica.lag > 5s. A 6-hour lag should have triggered alerts long before it caused data consistency issues.

typescript
// Separate client for coordination reads (always primary) const primaryRedis = new Redis({ host: 'redis-primary.internal' }); const replicaRedis = new Redis({ host: 'redis-replica.internal' }); // All coordination reads from primary const frontier = await primaryRedis.get(FRONTIER_KEY); const lockExists = await primaryRedis.exists(rangeKey); // Non-critical display data can use replica const displayMetrics = await replicaRedis.hgetall('indexer:metrics');

Memory Management Under Load

At 2,000+ TPS with 10 workers, Redis memory pressure was a real concern. Key management practices:

Bounded Sorted Set for completed blocks:

typescript
// Keep only the last 100,000 completed block numbers in the sorted set // (frontier advancement removes everything below the frontier anyway) await redis.zremrangebyrank(COMPLETED_BLOCKS_KEY, 0, -100001);

TTL on all coordination keys:

Every key written by the indexer has a TTL — lock keys (60s), heartbeat keys (30s), completion markers (24h). No orphan keys accumulate indefinitely.

Monitoring key count:

typescript
setInterval(async () => { const info = await primaryRedis.info('keyspace'); const dbMatch = info.match(/db0:keys=(\d+)/); if (dbMatch) { metrics.gauge('redis.key_count', parseInt(dbMatch[1])); } }, 60000);

Lessons for Your Own Distributed Systems

  1. Read coordination state from the primary. Replica lag causes phantom failures and false recoveries. The performance cost is acceptable for coordination operations.

  2. Make all writes idempotent. Double-processing must produce the same result as single-processing. Design your database writes with ON CONFLICT from the start, not as an afterthought.

  3. TTL everything. Every Redis key written by background processes must have a TTL. An exception is a memory leak waiting to manifest under production load.

  4. Heartbeats are not optional. Without heartbeats, you cannot distinguish a slow worker from a crashed one. Without that distinction, your reclaimer either ignores crashed workers (gap in processing) or aggressively reclaims slow workers (double processing under load).

  5. Monitor replication lag as a first-class metric. A large lag is a data consistency crisis in slow motion. Alert at seconds, not minutes.

  6. The sorted set frontier pattern scales. Using a Sorted Set to track out-of-order completions and advance a contiguous frontier is a general pattern — applicable to any system processing an ordered stream with parallel workers.


Summary

  • SupraScan coordinated 10+ concurrent worker instances using Redis block-range locks with heartbeat extension and UUID-based holder identity
  • The completed blocks Sorted Set + frontier key pattern tracked out-of-order processing and maintained strict sequential guarantees
  • Crash recovery via lock expiry + heartbeat absence detection — a worker with a missing heartbeat is presumed dead; its ranges are reclaimed
  • The 6-hour replication lag incident taught the fundamental lesson: always read coordination state from the primary; replica reads are for display data only
  • Idempotent database writes are the safety net — double processing should produce the same result as single processing
  • TTL every coordination key, monitor key counts, alert on replication lag above 5 seconds

Next: A-7 — Master-Replica Replication: PSYNC, Replication Buffer, and Lag — how Redis propagates writes to replicas, what happens when a replica falls behind, and how to measure and manage replication lag.

© 2026 Jatin Jain Saraf (JJS). All rights reserved.