The math behind event loop lag, microtask queue starvation, and UV_THREADPOOL_SIZE tuning for cryptographic validation at scale.
Module 2 — Event Loop Saturation & Thread Pool Offloading
What this module covers: The event loop from Module 0 was presented as a single loop. That was a simplification. The event loop has six distinct phases, each with its own queue, processed in strict order. When you have thousands of incoming transactions per second, microtask queues competing with I/O callbacks, timers firing at sub-millisecond intervals, and cryptographic operations queuing for a thread pool that defaults to 4 threads — the phase structure determines everything about your latency profile. This module covers the event loop precisely, shows you how to measure saturation, and gives you the tools to offload work correctly.
The Six Phases of the Event Loop
The Node.js event loop is not a simple loop over a single queue. It is a phased loop — in each iteration, it processes up to six distinct phases, and the order is non-negotiable.
```
   ┌───────────────────────────┐
┌─>│           timers          │  → setTimeout, setInterval callbacks
│  └─────────────┬─────────────┘
│  ┌─────────────▼─────────────┐
│  │     pending callbacks     │  → I/O errors deferred to next iteration
│  └─────────────┬─────────────┘
│  ┌─────────────▼─────────────┐
│  │       idle, prepare       │  → Internal use only
│  └─────────────┬─────────────┘
│  ┌─────────────▼─────────────┐
│  │           poll            │  → Retrieve new I/O events, execute I/O callbacks
│  └─────────────┬─────────────┘
│  ┌─────────────▼─────────────┐
│  │           check           │  → setImmediate callbacks
│  └─────────────┬─────────────┘
│  ┌─────────────▼─────────────┐
└──┤      close callbacks      │  → socket.on('close'), etc.
   └───────────────────────────┘
```
Between every phase and between every callback within a phase, Node.js drains two special queues:
- process.nextTick queue: highest priority
- Promise microtask queue: second priority
These micro-queues drain completely before the next phase or callback runs. This ordering has critical implications for high-throughput ingestion.
Phase 1: Timers
Executes callbacks scheduled by setTimeout and setInterval whose thresholds have elapsed. The "threshold" is a minimum — the callback won't run before the specified time, but it may run later if other phases are busy.
Production implication: If your ingestion pipeline uses setTimeout(fn, 100) as a flush trigger for a batch write, that timer will not fire at exactly 100ms if the poll phase is busy processing incoming transaction data. At 50K events/sec, the poll phase can be continuously occupied, delaying timers by hundreds of milliseconds.
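A minimal sketch of this effect, assuming a 250 ms synchronous busy loop stands in for a saturated poll phase. The durations and variable names are illustrative and not taken from a real pipeline.

```javascript
// Request a 100 ms timer, then keep the event loop busy for 250 ms.
const REQUESTED_MS = 100;
const BUSY_MS = 250; // assumption: stands in for a long run of I/O callbacks

const scheduledAt = performance.now();
let observedDelayMs = null;

setTimeout(() => {
  observedDelayMs = performance.now() - scheduledAt;
  console.log(`requested ${REQUESTED_MS}ms, fired after ${observedDelayMs.toFixed(1)}ms`);
}, REQUESTED_MS);

// Synchronous work: the timers phase cannot run until this loop returns.
const busyUntil = performance.now() + BUSY_MS;
while (performance.now() < busyUntil) { /* spin */ }
```

The timer fires only once the busy loop has finished, so the observed delay is at least the busy duration rather than the requested 100 ms.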
Phase 2: Pending Callbacks
Executes I/O callbacks that were deferred to the next loop iteration by the OS (typically some TCP errors). Rarely populated in normal operation.
Phase 3: Idle, Prepare
Internal to libuv. Not accessible from JavaScript.
Phase 4: Poll
The most important phase for I/O-intensive applications.
The poll phase does two things:
- Calculates how long to block waiting for new I/O events (0ms if setImmediate callbacks are pending, otherwise until the nearest timer is due or new I/O arrives)
- Processes I/O callbacks in the poll queue
For a blockchain indexer receiving a continuous stream of transactions: incoming socket data triggers epoll notifications → libuv adds callbacks to the poll queue → the poll phase drains the poll queue. As long as data keeps arriving, the poll phase stays busy.
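The blocking calculation is critical: if the poll queue fills as fast as it drains, the event loop spends almost all of its time in the poll phase. libuv caps the number of callbacks it runs in one poll pass (a system-dependent hard limit), so the loop does eventually move on, but timers fire late and setImmediate callbacks wait behind a long run of I/O callbacks. This is event loop starvation.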
Phase 5: Check
Executes setImmediate callbacks. This phase runs after the poll phase, not before. If you want code to run "soon" but after current I/O has been processed, setImmediate is correct. If you want code to run "immediately" (before any I/O callbacks), process.nextTick is correct.
```javascript
// Common pattern for blockchain indexers:
// After writing to DB, immediately schedule validation (before timers, after I/O)
db.write(tx).then(() => {
  setImmediate(() => validateAndIndex(tx));
});
```
Phase 6: Close Callbacks
Executes close event callbacks (socket.on('close', ...)). Cleanup only.
Microtask Queues: The Invisible Priority System
Before every phase transition and between every callback, Node.js drains microtask queues in priority order:
Priority 1: process.nextTick queue
Priority 2: Promise resolution queue (queueMicrotask, .then, await)
Both queues drain completely before the event loop moves forward. This has a dangerous implication: if you continuously add items to these queues, the event loop never advances.
The Microtask Starvation Pattern
```javascript
// DANGEROUS: nextTick recursion drains the whole queue without yielding
function processTransactions(queue) {
  if (queue.length === 0) return;
  const tx = queue.shift();
  handleTransaction(tx);
  process.nextTick(() => processTransactions(queue)); // ← re-queues itself until the queue is empty
}

// At 50K queued transactions:
// - processTransactions runs, schedules nextTick
// - nextTick queue drains: runs processTransactions again
// - Schedules another nextTick
// - nextTick queue drains again... 50,000 times
// - The event loop does not leave the current phase until all 50,000 are done
//   (and never does if producers keep refilling the queue)
// - All I/O, timers, and setImmediate are starved in the meantime
```
```javascript
// CORRECT: yield to the event loop periodically
function processTransactions(queue) {
  const BATCH_SIZE = 100;
  let processed = 0;

  while (queue.length > 0 && processed < BATCH_SIZE) {
    const tx = queue.shift();
    handleTransaction(tx);
    processed++;
  }

  if (queue.length > 0) {
    // Use setImmediate to yield to the event loop after each batch
    // This allows I/O and timers to run between batches
    setImmediate(() => processTransactions(queue));
  }
}
```
Promise Chain Depth and Starvation
```javascript
// This Promise chain processes 10,000 transactions without yielding
async function processAll(transactions) {
  for (const tx of transactions) {
    await parseTransaction(tx);   // continuation queued as a microtask
    await validateSignature(tx);  // continuation queued as a microtask
    await writeToDatabase(tx);    // continuation queued as a microtask
  }
}

// At 10,000 transactions with 3 awaits each: 30,000 microtasks.
// If the awaited functions are CPU-bound and resolve without real async I/O,
// every continuation runs back to back from the microtask queue,
// before the event loop advances past the current phase
// → I/O callbacks starved for the duration of this batch
```
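One way to break up such a chain is to await a setImmediate-based promise every N items. The sketch below uses the promise form of setImmediate from node:timers/promises. The batch size of 100 and the `handleTx` callback are assumptions made for illustration, not values from the original pipeline.

```javascript
import { setImmediate as yieldToEventLoop } from 'node:timers/promises';

const YIELD_EVERY = 100; // assumption: tune against measured event loop lag

// handleTx is a hypothetical async handler supplied by the caller.
async function processAllWithYield(transactions, handleTx) {
  let yields = 0;
  for (let i = 0; i < transactions.length; i++) {
    await handleTx(transactions[i]);
    if ((i + 1) % YIELD_EVERY === 0) {
      // Resumes in the check phase; between successive yields the loop
      // gets to run its timers and poll phases.
      await yieldToEventLoop();
      yields++;
    }
  }
  return yields;
}

// Usage: 1,000 no-op transactions yield to the loop 10 times.
const transactions = Array.from({ length: 1000 }, (_, id) => ({ id }));
const yieldCount = processAllWithYield(transactions, async () => {});
yieldCount.then((n) => console.log(`yielded ${n} times`));
```

A smaller `YIELD_EVERY` lowers worst-case lag for other callbacks at the cost of more loop iterations per batch.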
Event Loop Utilization (ELU): The Critical Production Metric
Event Loop Utilization (ELU) measures the ratio of time the event loop spends actively executing JavaScript vs idling in the poll phase waiting for I/O.
ELU = active_time / (active_time + idle_time)
ELU = 0.0 → event loop idle, no work to do
ELU = 0.5 → event loop 50% busy, 50% waiting for I/O
ELU = 0.9 → event loop 90% busy — danger zone
ELU = 1.0 → event loop fully saturated, no capacity for new work
```javascript
// Measuring ELU in production
import { performance } from 'node:perf_hooks';

// Capture a baseline at startup
let lastELU = performance.eventLoopUtilization();

// Measure the delta over each 5-second window
setInterval(() => {
  const now = performance.eventLoopUtilization();
  const windowELU = performance.eventLoopUtilization(now, lastELU);
  lastELU = now;

  console.log(`ELU: ${(windowELU.utilization * 100).toFixed(1)}%`);
  if (windowELU.utilization > 0.85) {
    console.warn('EVENT LOOP SATURATED: offload CPU work or scale horizontally');
  }
}, 5000);
```
```javascript
// Export to Prometheus for production alerting
import { performance } from 'node:perf_hooks';
import { Gauge } from 'prom-client';

const eluGauge = new Gauge({
  name: 'nodejs_event_loop_utilization',
  help: 'Event loop utilization (0–1)',
});

let lastELU = performance.eventLoopUtilization();
setInterval(() => {
  const now = performance.eventLoopUtilization();
  // Report utilization for the window since the previous sample
  eluGauge.set(performance.eventLoopUtilization(now, lastELU).utilization);
  lastELU = now;
}, 1000);
```
Alert thresholds for a UPI payment gateway:
| ELU | Status | Action |
|---|---|---|
| < 0.70 | Healthy | No action |
| 0.70–0.85 | Watch | Profile for CPU hotspots |
| 0.85–0.95 | Warning | Offload CPU work, scale |
| > 0.95 | Critical | Immediate intervention, add instances |
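The thresholds in the table can be encoded as a small helper for alerting code. This is a sketch: the function names are hypothetical, and because the table lists 0.85 in two bands, placing exactly 0.85 in the Warning band is an assumption.

```javascript
// ELU is active / (active + idle), as defined earlier in this module.
function computeElu(activeMs, idleMs) {
  const total = activeMs + idleMs;
  return total === 0 ? 0 : activeMs / total;
}

// Status names mirror the alert table.
function classifyElu(utilization) {
  if (utilization < 0.70) return 'Healthy';
  if (utilization < 0.85) return 'Watch';
  if (utilization <= 0.95) return 'Warning';
  return 'Critical';
}

// 900 ms active and 100 ms idle in a 1-second window gives ELU 0.9.
console.log(classifyElu(computeElu(900, 100))); // Warning
```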
Event Loop Lag: Measuring Actual Delay
ELU tells you how busy the loop is. Event loop lag tells you how delayed callbacks are relative to when they were scheduled.
```javascript
// Measure event loop lag directly
function measureLag() {
  const start = process.hrtime.bigint();
  setImmediate(() => {
    const lag = Number(process.hrtime.bigint() - start) / 1_000_000; // ms
    console.log(`Event loop lag: ${lag.toFixed(2)}ms`);
  });
}

setInterval(measureLag, 1000);
```
A setImmediate callback should run in < 0.1ms under no load. Under saturation:
- At ELU 0.80: lag typically 2–10ms
- At ELU 0.90: lag typically 10–50ms
- At ELU 0.95: lag typically 50–200ms
For a blockchain indexer, 50ms event loop lag means WebSocket subscribers receive transaction confirmations 50ms late — and that 50ms compounds across every downstream consumer.
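Node.js also ships a built-in lag sampler, `monitorEventLoopDelay` in node:perf_hooks, which records delays into a histogram so you can report percentiles instead of single probes. The sketch below is a self-terminating demo: the 100 ms `blockFor` stall simulates CPU-bound work and is an assumption for illustration only.

```javascript
import { monitorEventLoopDelay } from 'node:perf_hooks';

const histogram = monitorEventLoopDelay({ resolution: 20 }); // sample every 20 ms
histogram.enable();

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Simulated CPU-bound work (demo assumption).
function blockFor(ms) {
  const until = performance.now() + ms;
  while (performance.now() < until) { /* spin */ }
}

async function sampleLag() {
  await sleep(60);  // let a few normal samples land
  blockFor(100);    // stall the loop
  await sleep(60);  // let the delayed sample be recorded
  histogram.disable();
  // Histogram values are reported in nanoseconds.
  return {
    meanMs: histogram.mean / 1e6,
    p99Ms: histogram.percentile(99) / 1e6,
    maxMs: histogram.max / 1e6,
  };
}

const lagSummary = sampleLag();
lagSummary.then((summary) => console.log(summary));
```

In a long-running service you would leave the histogram enabled, export the percentiles on an interval, and call `histogram.reset()` after each export.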
The libuv Thread Pool
libuv maintains a thread pool for operations that cannot be made truly non-blocking at the OS level. This pool is separate from the event loop thread and runs in parallel.
What Uses the Thread Pool
```javascript
// These operations use the libuv thread pool:
import { readFile } from 'node:fs';
import { lookup } from 'node:dns';
import { pbkdf2, scrypt, randomBytes } from 'node:crypto';
import { createGzip } from 'node:zlib';

// Network sockets do NOT use the thread pool: they use epoll directly
// TCP, UDP, pipes: always async, never thread pool
```
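Critical for blockchain applications: crypto.pbkdf2, crypto.scrypt, and the callback forms of crypto.sign and crypto.verify run on the thread pool. The streaming crypto.createSign and crypto.createVerify APIs, like crypto.createHmac, run synchronously on the calling thread. Whichever path your signature verification takes, every transaction costs either a thread pool slot or event loop time.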
The Default 4-Thread Limit
The thread pool defaults to 4 threads. If you submit more than 4 concurrent blocking operations, they queue. This queue has no limit.
```javascript
// Demonstrating thread pool exhaustion
import crypto from 'node:crypto';

// Submit 20 simultaneous crypto operations: the first 4 fill the pool
const promises = Array.from({ length: 20 }, () =>
  new Promise((resolve, reject) => {
    crypto.pbkdf2('password', 'salt', 100000, 64, 'sha512', (err, key) => {
      if (err) reject(err);
      else resolve(key);
    });
  })
);

// Operations 5–20 queue behind operations 1–4
// Total time ≈ ceil(20/4) × per-operation-time (assuming at least 4 free cores)
// NOT 20 × per-operation-time (fully serial), NOT 1 × per-operation-time (fully parallel)
```
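The wave arithmetic for a burst of equal-cost jobs can be written out as a small helper. This is an idealised sketch: it assumes every job costs the same, that each pool thread has a free core, and the 100 ms per-job cost is a hypothetical figure.

```javascript
// Estimated wall-clock time for a burst of equal-cost thread pool jobs.
function estimateBurstMs(jobCount, poolSize, perJobMs) {
  const waves = Math.ceil(jobCount / poolSize); // jobs run poolSize at a time
  return waves * perJobMs;
}

// 20 jobs at a hypothetical 100 ms each:
console.log(estimateBurstMs(20, 4, 100));  // 500 (default pool: 5 waves)
console.log(estimateBurstMs(20, 16, 100)); // 200 (UV_THREADPOOL_SIZE=16: 2 waves)
```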
Sizing UV_THREADPOOL_SIZE for Cryptographic Workloads
The right size depends on your workload. A reasonable starting point for CPU-bound cryptographic work:
UV_THREADPOOL_SIZE = min(128, number_of_cpu_cores × 2)
For a blockchain indexer on an 8-core server performing signature verification on every transaction:
```bash
# Set before starting Node.js: cannot change at runtime
UV_THREADPOOL_SIZE=16 node indexer.js

# Or in your startup script:
export UV_THREADPOOL_SIZE=16
node indexer.js
```
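Why not just set it to 128? More threads = more memory (each thread has its own stack), more context switching, and eventually diminishing returns as CPU cores are shared. The formula above keeps thread count proportional to available parallelism. (libuv 1.30 and later accept values up to 1024; the 128 in the formula is a deliberately conservative ceiling, not the library's hard limit.)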
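The sizing rule is simple enough to compute at deploy time. A minimal sketch, assuming the `min(128, cores × 2)` rule of thumb from this section; `recommendedThreadPoolSize` is a hypothetical helper name.

```javascript
import os from 'node:os';

const MAX_POOL = 128; // ceiling used in this module's formula

function recommendedThreadPoolSize(cpuCores) {
  return Math.min(MAX_POOL, cpuCores * 2);
}

console.log(recommendedThreadPoolSize(8));  // 16, the 8-core example above
console.log(recommendedThreadPoolSize(96)); // 128, capped
console.log(`this host: ${recommendedThreadPoolSize(os.cpus().length)}`);
```

Treat the result as a starting value and confirm it against measured thread pool lag, since the pool is shared with fs, dns.lookup, and zlib work.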
```javascript
// Measuring thread pool saturation
// Monitor the delay between submitting a thread pool job and its completion
import crypto from 'node:crypto';

function measureThreadPoolLag() {
  const start = Date.now();
  crypto.randomBytes(32, () => { // lightweight thread pool job
    const lag = Date.now() - start;
    if (lag > 10) {
      console.warn(`Thread pool lag: ${lag}ms, pool may be saturated`);
    }
  });
}

setInterval(measureThreadPoolLag, 500);
```
worker_threads vs Thread Pool: Choosing the Right Tool
libuv's thread pool handles C-level blocking operations. worker_threads is for JavaScript-level CPU-bound work.
| | libuv Thread Pool | worker_threads |
|---|---|---|
| Language | C / native code | JavaScript |
| Control | Indirect (through APIs) | Direct (you write the worker) |
| Communication | Callback when done | Message passing / SharedArrayBuffer |
| Use case | crypto, fs, dns | Heavy JS computation |
| Default limit | 4 (configurable) | No system limit (you manage the pool) |
When to Use worker_threads
Use worker_threads when your hot path has CPU-bound work in JavaScript — complex data transformation, in-process schema validation at scale, or Merkle tree construction in JS.
```javascript
// main.js: worker thread pool for transaction validation
import { Worker, isMainThread, parentPort } from 'node:worker_threads';
import { cpus } from 'node:os';

const POOL_SIZE = cpus().length;
const workers = [];        // { worker, busy } slots
const queue = [];          // jobs waiting for an idle worker
const pending = new Map(); // id -> { resolve, reject }
let requestId = 0;

// Submit work to the pool (call from the main thread only).
// Exports must be at module top level, not inside the isMainThread branch.
export function validateTransaction(payload) {
  return new Promise((resolve, reject) => {
    const id = requestId++;
    pending.set(id, { resolve, reject });
    const idle = workers.find(w => !w.busy);
    if (idle) {
      idle.busy = true;
      idle.worker.postMessage({ id, payload });
    } else {
      queue.push({ id, payload }); // queue if all workers busy
    }
  });
}

if (isMainThread) {
  // Spawn pool
  for (let i = 0; i < POOL_SIZE; i++) {
    const slot = { worker: new Worker(new URL(import.meta.url)), busy: false };
    slot.worker.on('message', ({ id, result, error }) => {
      const { resolve, reject } = pending.get(id);
      pending.delete(id);
      error ? reject(new Error(error)) : resolve(result);

      // Hand the next queued job to this worker, or mark it idle
      const next = queue.shift();
      if (next) slot.worker.postMessage(next);
      else slot.busy = false;
    });
    workers.push(slot);
  }
} else {
  // Worker: CPU-bound validation logic
  parentPort.on('message', ({ id, payload }) => {
    try {
      const result = heavyValidation(payload); // runs in worker thread
      parentPort.postMessage({ id, result });
    } catch (err) {
      parentPort.postMessage({ id, error: err.message });
    }
  });

  function heavyValidation(payload) {
    // Schema validation, signature verification in JS, Merkle proof check
    // This runs on its own thread, so the main thread event loop is unaffected
    return true;
  }
}
```
SharedArrayBuffer: Zero-Copy Communication
For high-frequency worker communication, postMessage copies every payload using the structured clone algorithm. SharedArrayBuffer avoids that copy by letting both threads read and write the same memory:
```javascript
// Shared ring buffer between main thread and workers
// Zero serialization: both threads read/write the same memory
const BUFFER_SIZE = 1024 * 1024; // 1MB ring buffer
const sharedBuffer = new SharedArrayBuffer(BUFFER_SIZE);
const view = new Uint8Array(sharedBuffer);
const control = new Int32Array(new SharedArrayBuffer(8));
// control[0] = write position, control[1] = read position
// Both buffers must be handed to the worker (for example via workerData)

// Main thread writes transaction data
function writeTransaction(txBytes) {
  const writePos = Atomics.load(control, 0);
  // Copy in two pieces when the record wraps past the end of the buffer
  const firstLen = Math.min(txBytes.length, BUFFER_SIZE - writePos);
  view.set(txBytes.subarray(0, firstLen), writePos);
  view.set(txBytes.subarray(firstLen), 0);
  Atomics.store(control, 0, (writePos + txBytes.length) % BUFFER_SIZE);
  Atomics.notify(control, 0, 1); // wake one waiting worker
}
// Not shown: a production version must also check free space
// so that unread data is never overwritten.

// Worker reads
const readPos = Atomics.load(control, 1);
Atomics.wait(control, 0, readPos); // sleep only while write position === read position
// ... read and process
```
Preventing Event Loop Blocking During JSON Parsing
Large JSON payloads are a common blocking source. JSON.parse is synchronous and runs entirely on the main thread.
The Problem: Multi-MB Blockchain Payloads
A full Ethereum block can be 1–10MB of JSON. JSON.parse on a 5MB payload takes 20–80ms on modern hardware. On the event loop thread, this means 20–80ms where nothing else runs.
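You can check the parse cost on your own hardware with a synthetic payload. The sketch below builds roughly 5MB of JSON; the field names and the transaction count are illustrative assumptions and not a real block schema.

```javascript
// Build a synthetic payload of a few MB (field names are illustrative only).
const TX_COUNT = 25000;
const syntheticBlock = {
  number: 1,
  transactions: Array.from({ length: TX_COUNT }, (_, i) => ({
    hash: '0x' + i.toString(16).padStart(64, '0'),
    from: '0x' + 'a'.repeat(40),
    to: '0x' + 'b'.repeat(40),
    value: String(i),
  })),
};
const json = JSON.stringify(syntheticBlock);

const t0 = performance.now();
const parsed = JSON.parse(json); // synchronous: nothing else runs meanwhile
const parseMs = performance.now() - t0;

console.log(`${(json.length / 1e6).toFixed(1)} MB parsed in ${parseMs.toFixed(1)} ms`);
```

Whatever number this prints is time during which no timer, I/O callback, or setImmediate callback can run.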
```javascript
// BAD: synchronous parse of large payload on main thread
app.post('/ingest-block', (req, res) => {
  let body = '';
  req.on('data', chunk => body += chunk);
  req.on('end', () => {
    const block = JSON.parse(body); // ← blocks event loop for 20–80ms if body is 5MB
    processBlock(block);
    res.sendStatus(200);
  });
});
```
Solution 1: Offload to a Worker Thread
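```javascript
// parser-worker.js
import { parentPort } from 'node:worker_threads';

parentPort.on('message', ({ id, jsonString }) => {
  try {
    const parsed = JSON.parse(jsonString); // runs in worker, main thread free
    // The result is structured-cloned back to the main thread,
    // so return only the fields the main thread needs
    parentPort.postMessage({ id, result: parsed });
  } catch (err) {
    parentPort.postMessage({ id, error: err.message });
  }
});

// main thread: JSON parse in worker
import { Worker } from 'node:worker_threads';

const parserWorker = new Worker(new URL('./parser-worker.js', import.meta.url));
const pending = new Map(); // id -> { resolve, reject }
let nextId = 0;

// One shared listener routes each reply to the promise that requested it
parserWorker.on('message', ({ id, result, error }) => {
  const { resolve, reject } = pending.get(id);
  pending.delete(id);
  error ? reject(new Error(error)) : resolve(result);
});

function parseBlockPayload(jsonString) {
  return new Promise((resolve, reject) => {
    const id = nextId++;
    pending.set(id, { resolve, reject });
    parserWorker.postMessage({ id, jsonString });
  });
}
```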
Solution 2: Streaming JSON Parser
For very large payloads, a streaming parser processes JSON incrementally, yielding to the event loop between chunks:
```javascript
// Using 'clarinet' for streaming JSON parsing
import clarinet from 'clarinet';

function parseBlockStream(readable) {
  return new Promise((resolve, reject) => {
    const parser = clarinet.createStream();
    const transactions = [];
    let inTransactionsArray = false;
    let depth = 0; // array nesting depth; adjust the target depth to your payload shape

    parser.on('openarray', () => {
      depth++;
      if (depth === 2) inTransactionsArray = true;
    });
    parser.on('closearray', () => {
      depth--;
      if (depth < 2) inTransactionsArray = false;
    });
    // 'value' fires for primitive values; objects arrive as
    // openobject / key / closeobject events
    parser.on('value', (value) => {
      if (inTransactionsArray) {
        transactions.push(value);
      }
    });
    parser.on('end', () => resolve(transactions));
    parser.on('error', reject);

    readable.pipe(parser); // non-blocking: parser processes chunks as they arrive
  });
}
```
The Phase Interaction: setImmediate vs process.nextTick vs Promise
Understanding the exact execution order is critical for scheduling work correctly in a high-throughput pipeline.
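```javascript
// Execution order demonstration
console.log('1: synchronous');
process.nextTick(() => console.log('2: nextTick'));
Promise.resolve().then(() => console.log('3: Promise'));
setImmediate(() => console.log('4: setImmediate'));
setTimeout(() => console.log('5: setTimeout 0'), 0);
console.log('6: synchronous');

// Typical output (CommonJS main module):
// 1: synchronous
// 6: synchronous
// 2: nextTick        ← nextTick queue (before Promise microtasks)
// 3: Promise         ← Promise microtask queue
// 4: setImmediate    ← check phase
// 5: setTimeout 0    ← timers phase
//
// Lines 4 and 5 are not guaranteed in this order when scheduled from the
// main module: if the 1ms timer threshold has already elapsed when the loop
// starts, the timer fires first. Inside an I/O callback, setImmediate
// always runs before setTimeout(fn, 0).
// At the top level of an ES module, line 3 prints before line 2, because
// module evaluation itself runs inside a microtask.
```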
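The relative order of setImmediate and setTimeout(fn, 0) becomes deterministic once both are scheduled from inside an I/O callback, because the check phase comes directly after the poll phase. A minimal sketch, using an `fs.stat` call on the current directory purely as a convenient source of an I/O callback:

```javascript
import { stat } from 'node:fs';

const order = [];

const orderObserved = new Promise((resolve) => {
  // The stat callback runs in the poll phase.
  stat('.', () => {
    setTimeout(() => {
      order.push('timeout');   // timers phase of the NEXT iteration
      resolve(order);
    }, 0);
    setImmediate(() => order.push('immediate')); // check phase of THIS iteration
  });
});

orderObserved.then((o) => console.log(o.join(' → '))); // immediate → timeout
```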
For a blockchain indexer, this means:
```javascript
// Transaction arrives → parse → write to DB → notify subscribers

// BAD: using setTimeout(fn, 0) for subscriber notification
// This defers to the NEXT iteration of the event loop
// → adds minimum one full event loop iteration of latency

// CORRECT: using setImmediate
// Runs in the check phase of the CURRENT iteration
// → subscriber notified in the same loop iteration as the DB write completes
db.write(transaction).then(() => {
  setImmediate(() => notifySubscribers(transaction)); // same loop iteration
});
```
The Production Incident: Thread Pool Exhaustion During UPI Festival Spike
Context: A UPI payment gateway processing ~1,500 transactions/second normally. During a major festival sale, traffic spikes to 12,000 transactions/second.
The architecture:
```javascript
// Each incoming payment required signature verification
app.post('/payment', async (req, res) => {
  const payment = req.body;

  // Verify HMAC signature: uses crypto module → thread pool
  const isValid = await verifyHmac(payment.signature, payment.data, secretKey);
  if (!isValid) return res.status(401).json({ error: 'Invalid signature' });

  await db.write(payment);
  res.json({ status: 'accepted' });
});

function verifyHmac(signature, data, key) {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2(key, data, 1, 64, 'sha512', (err, derivedKey) => {
      if (err) return reject(err);
      resolve(derivedKey.equals(Buffer.from(signature, 'hex')));
    });
  });
}
```
What happened at 12,000 req/sec:
Default UV_THREADPOOL_SIZE = 4. Each crypto.pbkdf2 call takes ~8ms. At 12,000 req/sec, the thread pool needed to complete 12,000 operations/sec. Maximum throughput: 4 threads × (1000ms / 8ms) = 500 operations/sec.
The queue of pending thread pool operations grew to over 20,000 within 2 seconds. Each incoming payment request waited in the thread pool queue. The event loop remained responsive (ELU: 0.12 — barely busy), but every request timed out because verifyHmac took 30–120 seconds to resolve — not because of CPU, but because of thread pool starvation.
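The incident numbers follow from a simple steady-state model. This is an idealised sketch that assumes equal job cost and no other users of the pool; the helper names are hypothetical.

```javascript
// Jobs per second a pool can complete.
function poolCapacityPerSec(threads, perJobMs) {
  return threads * (1000 / perJobMs);
}

// Jobs left waiting after `seconds` of sustained overload.
function backlogAfter(seconds, arrivalPerSec, capacityPerSec) {
  return Math.max(0, (arrivalPerSec - capacityPerSec) * seconds);
}

const capacity = poolCapacityPerSec(4, 8);        // 500 jobs/sec
const backlog = backlogAfter(2, 12000, capacity); // 23,000 queued jobs
const queueWaitSec = backlog / capacity;          // 46 s for a job joining the queue now

console.log({ capacity, backlog, queueWaitSec });
```

After only two seconds of the spike, a newly queued verification already waits about 46 seconds, which is why every request timed out while the event loop itself stayed nearly idle.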
The fix:
```javascript
// Fix 1: Increase thread pool size
// UV_THREADPOOL_SIZE=32 node payment-gateway.js

// Fix 2: Replace pbkdf2 (slow, CPU-intensive) with createHmac (fast, no thread pool)
function verifyHmac(signature, data, key) {
  // crypto.createHmac runs synchronously but is fast (~0.1ms or less)
  // Does NOT use thread pool: runs directly on the main thread
  // At a full 0.1ms per verification, one thread saturates near 10,000/sec,
  // so measure the real per-call cost under load
  const computed = crypto.createHmac('sha256', key).update(data).digest();
  const received = Buffer.from(signature, 'hex');
  // timingSafeEqual throws on a length mismatch, so reject those up front
  if (received.length !== computed.length) return false;
  return crypto.timingSafeEqual(computed, received);
}

// Fix 3: For slow operations (genuine pbkdf2 for key derivation),
// use worker_threads with a pool sized to CPU count
```
After the fix: At 12,000 req/sec, createHmac verifications ran in under 0.1ms each on the main thread. ELU increased to 0.42 (acceptable). No thread pool queue buildup. Response time: 4–12ms end-to-end.
Summary
| Concept | Key Takeaway |
|---|---|
| Six event loop phases | Timers → Pending → Idle/Prepare → Poll → Check → Close. Strict execution order. |
| Poll phase | Where I/O callbacks run. Continuous data = loop stays in poll phase. |
| Microtask queues | process.nextTick then Promises, before every phase. Infinite recursion starves the loop. |
| ELU | Ratio of active to idle event loop time. Alert at > 0.85. |
| Event loop lag | Actual delay between scheduling and execution. Measurable with setImmediate probe. |
| Thread pool | Default 4 threads for crypto, fs, dns. Size with UV_THREADPOOL_SIZE = cores × 2. |
| Thread pool exhaustion | Requests don't fail — they just wait forever. ELU stays low while requests queue. |
| worker_threads | JavaScript CPU work off the main thread. Use for heavy JS computation, not I/O. |
| SharedArrayBuffer | Zero-copy communication between threads. Use for high-frequency data transfer. |
| Streaming JSON | Parse multi-MB payloads with clarinet or in a worker thread. Never JSON.parse large payloads on the main thread. |
| setImmediate vs nextTick | setImmediate yields to the event loop (check phase). nextTick is immediate (before next phase). |
The event loop determines when your code runs. The next layer down determines what happens when data actually arrives at the kernel. Module 3 goes into the OS-level I/O multiplexing that the event loop is built on — epoll, kqueue, and the precise journey a transaction takes from the NIC to your JavaScript callback.