Module A-1 · 20 min read

SET key value NX PX as the atomic lock primitive, UUID lock values to prevent accidental release, lock extension with conditional PEXPIRE, the critical GC-pause failure mode, and why distributed locks need fencing tokens.

A-1 — Single-Instance Locking: SET NX PX and Lock Correctness

Who this module is for: You need to ensure only one process at a time executes a critical section — a payment deduction, a job processing step, a cache recompute. This module covers the correct Redis single-instance lock primitive, the mistakes that make naive implementations unsafe, and the fundamental limitation that requires fencing tokens for true correctness.


The Lock Primitive: SET NX PX

A Redis distributed lock uses a single key with three properties:

  1. Existence: if the key exists, the lock is held
  2. Identity: the key's value identifies the lock holder (prevents accidental release)
  3. Expiry: the key has a TTL, so the lock auto-releases if the holder crashes

The atomic primitive that satisfies all three in a single command:

SET lock:resource "lock-value" NX PX 30000
  • NX — only set if the key does Not eXist (acquire only if nobody holds the lock)
  • PX 30000 — expire in 30,000 milliseconds (auto-release if holder crashes)

Returns OK if the lock was acquired, nil if already held by another client.

Why this must be a single command: A non-atomic acquire would be:

# Wrong (race conditions):
EXISTS lock:resource         → 0 (not held)
SET lock:resource "holder"   → sets the lock
EXPIRE lock:resource 30      → sets expiry (EXPIRE takes seconds; PEXPIRE takes milliseconds)

Between EXISTS and SET, another client can acquire the lock, and the plain SET then overwrites it. Between SET and EXPIRE, a crash leaves a lock that never expires. A single SET key value NX PX command eliminates both races.
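
To make the semantics concrete, here is a toy in-memory model of what SET with NX and PX does. It is not Redis and not a client library: `ToyStore`, `setNxPx`, and the injected `now` argument are hypothetical names used only for illustration, and the clock is passed in so expiry can be shown without waiting.

```typescript
// Toy model of `SET key value NX PX ttl` (illustration only, not a Redis client).
type Entry = { value: string; expiresAt: number };

class ToyStore {
  private entries = new Map<string, Entry>();

  // `now` is injected so that expiry can be demonstrated without real waiting.
  setNxPx(key: string, value: string, ttlMs: number, now: number): 'OK' | null {
    const current = this.entries.get(key);
    // An expired key counts as absent.
    if (current !== undefined && current.expiresAt > now) {
      return null; // NX: the key exists, so the set is refused
    }
    // Value and expiry are written together: there is no window in which
    // the key exists without a TTL.
    this.entries.set(key, { value, expiresAt: now + ttlMs });
    return 'OK';
  }
}

const store = new ToyStore();
const first = store.setNxPx('lock:resource', 'client-a', 30_000, 0);       // acquired
const second = store.setNxPx('lock:resource', 'client-b', 30_000, 10_000); // refused: still held
const third = store.setNxPx('lock:resource', 'client-b', 30_000, 30_000);  // acquired: TTL elapsed
```

The point of the model is that the existence check, the write, and the expiry happen in one step, which is exactly what the three-command version cannot guarantee.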


Lock Value: UUID for Identity

The lock value must uniquely identify the holder. Use a cryptographically random UUID:

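```typescript
import { randomUUID } from 'crypto';

const lockValue = randomUUID(); // e.g., "f47ac10b-58cc-4372-a567-0e02b2c3d479"

// Redis accepts NX and PX in either order; PX first matches the typed overloads in ioredis v5.
await redis.set('lock:payment:1001', lockValue, 'PX', 30000, 'NX');
```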

Why identity matters: Without a unique value, any client can release any lock:

# Wrong (no identity):
Client A: SET lock:payment "1" NX PX 30000  → acquires lock
# ... lock expires while A is slow ...
Client B: SET lock:payment "1" NX PX 30000  → acquires lock (A's lock expired)
Client A: DEL lock:payment                  → releases B's lock! A does not know it lost the lock.

With a UUID, releasing the lock requires presenting the same UUID that was set:

```typescript
// Safe release via Lua (atomic check-and-delete)
const releaseScript = `
if redis.call("GET", KEYS[1]) == ARGV[1] then
  return redis.call("DEL", KEYS[1])
else
  return 0
end
`;

async function releaseLock(key: string, lockValue: string): Promise<boolean> {
  const result = await redis.eval(releaseScript, 1, key, lockValue);
  return result === 1;
}
```

The Lua script atomically checks that the current lock value matches before deleting — if the lock expired and was acquired by another client, the check fails and we do not release their lock.
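
The stale-release scenario can be replayed without a server. The sketch below uses a plain `Map` as a stand-in for the lock key and a hypothetical `releaseIfOwner` function that mirrors the script's compare-then-delete logic; the values `uuid-of-A` and `uuid-of-B` are placeholders for real UUIDs.

```typescript
// In-memory stand-in for the lock key (illustration only, not Redis).
const locks = new Map<string, string>();

// Mirrors the Lua script: delete only when the stored value is the caller's.
function releaseIfOwner(key: string, lockValue: string): boolean {
  if (locks.get(key) !== lockValue) return false;
  locks.delete(key);
  return true;
}

const lockKey = 'lock:payment';
locks.set(lockKey, 'uuid-of-A'); // Client A acquires
locks.delete(lockKey);           // A's TTL runs out while A is slow
locks.set(lockKey, 'uuid-of-B'); // Client B acquires

const staleRelease = releaseIfOwner(lockKey, 'uuid-of-A'); // false: B's lock survives
const ownerRelease = releaseIfOwner(lockKey, 'uuid-of-B'); // true: B releases its own lock
```

In Redis the comparison and the delete must run inside one script; a GET followed by a separate DEL from the client would reopen the race.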


Full Lock Implementation

```typescript
import { randomUUID } from 'crypto';
import Redis from 'ioredis';

const redis = new Redis();

// Delete the key only if it still holds this client's value.
const RELEASE_SCRIPT = `
if redis.call("GET", KEYS[1]) == ARGV[1] then
  return redis.call("DEL", KEYS[1])
else
  return 0
end
`;

// Reset the TTL only if the key still holds this client's value.
const EXTEND_SCRIPT = `
if redis.call("GET", KEYS[1]) == ARGV[1] then
  return redis.call("PEXPIRE", KEYS[1], ARGV[2])
else
  return 0
end
`;

interface Lock {
  release: () => Promise<boolean>;
  extend: (ttlMs: number) => Promise<boolean>;
}

async function acquireLock(
  resource: string,
  ttlMs: number,
  retries = 3,
  retryDelayMs = 100
): Promise<Lock | null> {
  const key = `lock:${resource}`;
  const value = randomUUID();

  // One initial attempt plus up to `retries` retries.
  for (let attempt = 0; attempt <= retries; attempt++) {
    const result = await redis.set(key, value, 'PX', ttlMs, 'NX');
    if (result === 'OK') {
      return {
        release: () => releaseLock(key, value),
        extend: (newTtlMs) => extendLock(key, value, newTtlMs),
      };
    }
    if (attempt < retries) {
      // Jittered delay so competing clients do not retry in lockstep.
      await new Promise((r) => setTimeout(r, retryDelayMs + Math.random() * retryDelayMs));
    }
  }
  return null; // Could not acquire after all retries
}

async function releaseLock(key: string, value: string): Promise<boolean> {
  const result = await redis.eval(RELEASE_SCRIPT, 1, key, value);
  return result === 1;
}

async function extendLock(key: string, value: string, newTtlMs: number): Promise<boolean> {
  const result = await redis.eval(EXTEND_SCRIPT, 1, key, value, String(newTtlMs));
  return result === 1;
}

// Usage:
async function processPayment(paymentId: string) {
  const lock = await acquireLock(`payment:${paymentId}`, 30000);
  if (!lock) {
    throw new Error('Could not acquire lock: payment already in progress');
  }
  try {
    await executePaymentLogic(paymentId);
  } finally {
    await lock.release();
  }
}
```
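
The acquire / try / finally pattern in the usage example tends to be repeated at every call site, so it can be factored into a helper. The following is a minimal sketch, not part of the implementation above: `withLock` and `Releasable` are hypothetical names, and the acquire function is passed in so the helper does not depend on a particular Redis client.

```typescript
// Minimal shape this helper needs; the Lock returned by acquireLock satisfies it.
interface Releasable {
  release: () => Promise<boolean>;
}

// Runs `work` while holding the lock and always attempts a release afterwards.
async function withLock<T>(
  acquire: () => Promise<Releasable | null>,
  work: () => Promise<T>
): Promise<T> {
  const lock = await acquire();
  if (lock === null) {
    throw new Error('Could not acquire lock');
  }
  try {
    return await work();
  } finally {
    // A false result means the key no longer held our value: the TTL ran out
    // before the work finished, so another client may have run concurrently.
    const released = await lock.release();
    if (!released) {
      console.warn('Lock expired before release; work may have overlapped');
    }
  }
}
```

With the functions above, a call might look like `await withLock(() => acquireLock('payment:1001', 30000), () => executePaymentLogic('1001'))`. Logging a failed release is a design choice in this sketch; a stricter caller could treat it as an error.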

Lock Extension (Heartbeat Pattern)

If your critical section takes longer than the lock TTL, the lock expires before the work is done — another client acquires it and you have two concurrent holders.

Solutions:

1. Set TTL generously — if work takes up to 5 seconds, set TTL to 30 seconds. Simple but wasteful (a crash holds the lock for up to 30 seconds).

2. Watchdog / heartbeat — a background timer extends the lock while the work is running:

```typescript
async function acquireLockWithWatchdog(resource: string, ttlMs: number): Promise<Lock | null> {
  const key = `lock:${resource}`;
  const value = randomUUID();

  const result = await redis.set(key, value, 'PX', ttlMs, 'NX');
  if (result !== 'OK') return null;

  // Watchdog: extend the lock every ttlMs/3
  const watchdogInterval = setInterval(async () => {
    try {
      const extended = await extendLock(key, value, ttlMs);
      if (!extended) {
        // Lock was lost (expired and possibly taken by another client)
        clearInterval(watchdogInterval);
        // Signal the work loop to abort
      }
    } catch (err) {
      // Redis unreachable: the lock may expire before a later attempt succeeds.
      console.error('Lock extension failed', err);
    }
  }, ttlMs / 3);

  return {
    release: async () => {
      clearInterval(watchdogInterval);
      return releaseLock(key, value);
    },
    extend: (newTtl) => extendLock(key, value, newTtl),
  };
}
```

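The watchdog resets the TTL to ttlMs every ttlMs/3 milliseconds, so the lock does not expire while the holder's process is running, the timer is firing, and Redis is reachable. Because the interval is a third of the TTL, one or two missed extensions are tolerated before the lock lapses. The watchdog lives in the same process as the work, so anything that freezes the process freezes the watchdog too.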
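
The watchdog code leaves "signal the work loop to abort" as a comment. One way to fill that gap, sketched here under the assumption that the work can poll a flag between steps, is to expose an `AbortSignal`. `startWatchdog` is a hypothetical helper; the `extend` callback is injected and would typically be `() => extendLock(key, value, ttlMs)`.

```typescript
// Starts a heartbeat that calls `extend` every `intervalMs` and aborts the
// returned signal as soon as an extension reports failure or throws.
function startWatchdog(
  extend: () => Promise<boolean>,
  intervalMs: number
): { signal: AbortSignal; stop: () => void } {
  const controller = new AbortController();
  const timer = setInterval(() => {
    extend()
      .then((ok) => {
        if (!ok) throw new Error('lock no longer held');
      })
      .catch(() => {
        // Lost lock or unreachable Redis: stop the heartbeat and tell the worker.
        clearInterval(timer);
        controller.abort();
      });
  }, intervalMs);
  return {
    signal: controller.signal,
    stop: () => clearInterval(timer),
  };
}
```

A worker would check `signal.aborted` before each side-effecting step and call `stop()` before releasing the lock. Aborting on the first error is the conservative choice in this sketch; a more tolerant variant could retry once before giving up.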


The Fundamental Limitation: Process Pauses

Here is the scenario that breaks even a correctly implemented single-instance lock:

Time 0ms: Client A acquires lock (TTL = 30s)
Time 1ms: Client A begins critical section
Time 5000ms: Client A's JVM pauses for GC (stop-the-world collection)
Time 30000ms: Lock expires (the TTL ran out while A was paused)
Time 30001ms: Client B acquires the lock
Time 30002ms: Client B begins critical section
Time 40000ms: Client A resumes from the GC pause and still thinks it holds the lock
             A and B are now BOTH executing the critical section

This is not a bug in the lock implementation; it is a fundamental property of distributed systems. Any client can stall for an arbitrary duration (GC, OS scheduling, VM migration) or have its requests delayed by the network. By the time its next write reaches the protected resource, its lock may have expired and been acquired by another client.

The lock does not know the holder paused. The holder does not know the lock expired.
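
A partial mitigation is for the holder to track the TTL on its own monotonic clock and check it before each side-effecting step. This narrows the window but does not close it, since a pause can still land between the check and the write. The sketch below is an illustration only: `createLockDeadline` and its safety margin are hypothetical, and the clock is injectable so the timeline above can be replayed.

```typescript
// Hypothetical helper: tracks, on the holder's own clock, how long the lock can be trusted.
function createLockDeadline(
  ttlMs: number,
  safetyMarginMs: number,
  now: () => number = () => performance.now() // monotonic clock, injectable for demos
) {
  // Create the deadline just before sending SET, so network latency shortens
  // the trusted window instead of extending it.
  const deadline = now() + ttlMs - safetyMarginMs;
  return {
    remainingMs: (): number => Math.max(0, deadline - now()),
    assertStillHeld(): void {
      if (deadline - now() <= 0) {
        throw new Error('Lock TTL elapsed locally; aborting critical section');
      }
    },
  };
}

// Replaying the timeline above with a fake clock:
let fakeNow = 0;
const lockDeadline = createLockDeadline(30_000, 1_000, () => fakeNow);
fakeNow = 5_000;                                          // the pause begins here
const remainingBeforePause = lockDeadline.remainingMs(); // 24000
fakeNow = 40_000;                                         // the process resumes
const remainingAfterPause = lockDeadline.remainingMs();  // 0
```

After resuming, Client A would call `assertStillHeld()` before writing and abort instead of proceeding.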

Fencing Tokens: The Correct Solution

A fencing token is a monotonically increasing integer issued by the lock service when a lock is acquired. The holder passes the fencing token to the resource being protected. The resource rejects any operation with a token lower than the highest it has seen.

Client A acquires lock → receives token 42
Client A pauses (GC) → lock expires
Client B acquires lock → receives token 43
Client B writes to database with token 43
Client A resumes → writes to database with token 42
Database: token 42 < max_seen (43) → REJECT Client A's write
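
The rejection rule on the resource side is small enough to sketch directly. `FencedStore` below is a hypothetical in-memory resource, not a real database API; in a real system the compare-and-write would have to be atomic inside the resource itself (for example, a conditional update).

```typescript
// Toy resource that enforces fencing (illustration only).
class FencedStore {
  private highestToken = 0;
  private data = new Map<string, string>();

  // Accepts the write only if `token` is not lower than the highest token seen so far.
  write(token: number, key: string, value: string): boolean {
    if (token < this.highestToken) {
      return false; // stale holder: reject
    }
    this.highestToken = token;
    this.data.set(key, value);
    return true;
  }

  read(key: string): string | undefined {
    return this.data.get(key);
  }
}

const db = new FencedStore();
const writeB = db.write(43, 'balance', 'written-by-B'); // accepted
const writeA = db.write(42, 'balance', 'written-by-A'); // rejected: 42 < 43
```

Equal tokens are accepted so that the current holder can issue several writes under one acquisition.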

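SET NX PX does not issue fencing tokens: the command returns OK or nil, not a sequence number. On a single instance you can approximate one by incrementing a counter with INCR in the same Lua script that acquires the lock, but the counter is only as durable as your persistence and replication settings; a restart that loses recent writes, or a failover to a lagging replica, can hand out a token that was already used. Where tokens must never repeat or go backwards, use a sequencer built on consensus (ZooKeeper, etcd) or a database sequence. In every case the protected resource must check the token, otherwise the token provides no protection.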
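
Although the lock command itself returns no token, a workaround on a single instance is to bump a counter in the same script that takes the lock. The sketch below assumes a single Redis instance whose data survives restarts; the `fence:` key prefix, `acquireWithToken`, and the `EvalClient` interface are hypothetical, with `EvalClient` intended to match the way `redis.eval` is called elsewhere in this module.

```typescript
// Minimal client shape, modelled on how redis.eval is used in this module.
interface EvalClient {
  eval(script: string, numKeys: number, ...args: (string | number)[]): Promise<unknown>;
}

// Acquire the lock and, only on success, increment a per-resource counter.
// Both steps run in one script, so no other command can interleave.
const ACQUIRE_WITH_TOKEN = `
if redis.call("SET", KEYS[1], ARGV[1], "NX", "PX", ARGV[2]) then
  return redis.call("INCR", KEYS[2])
end
return false
`;

// Resolves to the fencing token on success, or null if the lock is already held.
async function acquireWithToken(
  client: EvalClient,
  resource: string,
  lockValue: string,
  ttlMs: number
): Promise<number | null> {
  const result = await client.eval(
    ACQUIRE_WITH_TOKEN,
    2,
    `lock:${resource}`,
    `fence:${resource}`, // hypothetical counter key
    lockValue,
    ttlMs
  );
  return typeof result === 'number' ? result : null;
}
```

The token is only useful if the protected resource compares it against the highest value it has seen, and it is only monotonic while the counter key is never lost or rolled back.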

Practical implication: For most application-level distributed locking (preventing double processing of a job, preventing concurrent cache recomputes), the residual risk is acceptable. Only a pause that outlasts the remaining TTL causes overlap, and a GC pause of 30+ seconds is rare in most JVM and Node.js applications.

For operations where two concurrent executions would cause data corruption with no recovery path (bank transfers, inventory deduction, ledger entries), use database transactions with row-level locking — not Redis locks.


When Single-Instance Locks Are Correct

| Use case | Single-instance lock appropriate? |
| --- | --- |
| Cache stampede prevention (lock + recompute) | Yes: a double recompute is wasteful but not corrupting |
| Job queue deduplication | Yes, provided the job is idempotent, so a rare double run is harmless |
| Rate limiter coordination | No: better handled with INCR + EXPIRE |
| Inventory reservation (read-then-write) | Risky: consider a DB transaction with SELECT FOR UPDATE |
| Financial deductions (write once) | No: use a DB transaction with a row lock and an idempotency key |
| Distributed coordination across services | No: use etcd or ZooKeeper (both can supply fencing tokens) |

Summary

  • The correct lock primitive: SET key value NX PX milliseconds — atomic, single command
  • Lock value must be a unique UUID — required for safe release (only the holder can release its lock)
  • Release via Lua script: check UUID matches before DEL — atomic check-and-delete
  • Lock extension via Lua: PEXPIRE only if UUID matches — prevents extending someone else's lock
  • Watchdog pattern: background timer extends the lock every TTL/3 to prevent expiry during long operations
  • Fundamental limitation: a process pause (GC, OS scheduling) can outlast the TTL, so a client keeps acting as the holder after its lock has expired; two clients can simultaneously believe they hold the lock
  • Fencing tokens are the correct solution: monotonically increasing integers that let the resource reject writes from stale holders; SET NX PX does not issue them
  • Use Redis locks for best-effort coordination where double-execution is safe; use DB row locks for true exclusive access with no tolerance for concurrent execution

Next: A-2 — Lua Scripting: EVAL, EVALSHA, and Atomic Compound Operations — why Lua scripts execute atomically, the KEYS/ARGV convention, and operations that cannot be implemented safely without Lua.

© 2026 Jatin Jain Saraf (JJS). All rights reserved.