Module F-2: Building Production-Grade Images — Docker In-Depth

Q: Why is it considered an anti-pattern to run `COPY . .` before `RUN npm install` in a Node.js Dockerfile?

A: Changing any source file invalidates the cache for the `COPY . .` layer, which forces `npm install` to run again on the next build. Docker evaluates cached layers in order, so once the layer that copied the entire repository is invalidated, every later layer, including `npm install`, is rebuilt from scratch.

Q: You are building a Docker image on an Apple Silicon (M2, `arm64`) Mac. You push this image to a registry and your CI/CD pipeline deploys it to an AWS EC2 instance running a standard Intel `x86_64` processor. The container instantly crashes with an `exec format error`. What is the correct architectural solution to this problem?

A: Use `docker buildx build --platform linux/amd64,linux/arm64` to build the image for both target architectures and push a manifest list. Containers share the host kernel and run directly on the host CPU, so binaries built for ARM (`arm64`) cannot execute on an Intel/AMD processor (`x86_64`, which Docker calls `amd64`). With the `--platform` flag, `docker buildx` builds one image variant per listed architecture (through emulation, cross-compilation, or native builder nodes, depending on the builder setup) and pushes a manifest list, also called a multi-platform image index, to the registry. When the EC2 instance pulls the image, its Docker client reads the manifest list and selects the `amd64` variant automatically.

Q: You have implemented a multi-stage build for a large Next.js application to reduce the final image size. However, you notice that in Stage 1, `npm ci` still takes several minutes to download packages from the npm registry every time you add a single new dependency, even though `package.json` is copied properly before the source code. How can you significantly speed up this specific scenario using modern Docker BuildKit features?

A: Use the `RUN --mount=type=cache,target=/root/.npm npm ci` syntax to persist the npm cache directory across separate Docker builds. Normally, a `RUN` step that cannot be served from the layer cache starts without any previously downloaded packages. If `package.json` changes, the `npm ci` layer is invalidated and every package must be downloaded again. BuildKit cache mounts (`--mount=type=cache`) let you name a directory (such as `/root/.npm`) that the builder persists and shares *between* independent builds on the same builder. When `package.json` changes, `npm ci` still re-runs, but it reads the preserved cache and downloads only the newly added dependency.

Module F-2 · 35 min read

Writing optimal Dockerfiles, layer caching, BuildKit mounts, multi-stage builds, and multi-architecture image registries.

Written by Jatin Jain Saraf · Senior Software Engineer

Introduction

A container is only as good as the image it runs from. In the JavaScript ecosystem, it is painfully common to see Next.js or Express Docker images that exceed 1GB in size, take 10 minutes to build, and constantly invalidate their caches.

In this module, we will explore how to architect production-grade Docker images. We will move beyond basic Dockerfile syntax and dive into Layer Caching, BuildKit optimizations, Multi-Stage Builds, and Multi-Architecture support.

The Layer Caching Mechanism

To write an optimized Dockerfile, you must first understand how Docker caches layers.

Each instruction in a Dockerfile (FROM, RUN, COPY, etc.) is a build step that Docker can cache. Instructions that change the filesystem, such as RUN, COPY, and ADD, create a new layer, while the others only record metadata. Docker processes the steps in order, top to bottom. If an instruction and its inputs haven't changed since the last build, Docker skips executing it and reuses the cached result.

However, there is a golden rule of caching: if a step's cache is invalidated, every step after it is invalidated as well.

The Node.js Caching Anti-Pattern

Here is the most common mistake in Node.js Dockerfiles:

dockerfile

# ❌ THE ANTI-PATTERN
FROM node:22-alpine
WORKDIR /app
# Cache is invalidated if ANY file in the repo changes!
COPY . .
# Forced to re-run whenever the layer above is invalidated!
RUN npm install
CMD ["npm", "start"]

Because COPY . . copies your entire codebase, editing a single markdown file or CSS file will invalidate the cache for that layer. Because that layer is invalidated, the next layer (RUN npm install) is also invalidated.

You end up waiting 3 minutes for npm install on every single build, even if your package.json hasn't changed.

The Correct Caching Strategy

You must separate your dependency installation from your source code:

dockerfile

# ✅ THE CORRECT PATTERN
FROM node:22-alpine
WORKDIR /app

# 1. Copy ONLY dependency manifests first
COPY package.json package-lock.json ./

# 2. Install dependencies (cached unless package.json or package-lock.json changes)
RUN npm ci

# 3. Copy the rest of the application code
COPY . .

CMD ["npm", "start"]

Because the dependency manifests are copied before the rest of the code, the RUN npm ci layer stays cached across source-code edits. It re-executes only when package.json or package-lock.json changes, for example when you install a new package.
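
The effect of instruction order on the cache can be sketched with a toy model. The script below is an illustration only, not Docker's real cache-key algorithm: it assumes each step's key is a checksum of the parent key, the instruction text, and a stand-in checksum for the step's inputs. The helper names (chain_keys, install_key, antipattern, correct) and the input labels m1, s1, and s2 are made up for this example.

```shell
# Toy model of layer caching (NOT Docker's real algorithm): a step's cache key
# is a checksum of its parent's key, its instruction text, and its inputs.

# chain_keys INSTRUCTION INPUT [INSTRUCTION INPUT ...]
# Prints one "KEY INSTRUCTION" line per step, in build order.
chain_keys() {
  parent="base"
  while [ "$#" -ge 2 ]; do
    parent=$(printf '%s|%s|%s' "$parent" "$1" "$2" | cksum | cut -d' ' -f1)
    printf '%s %s\n' "$parent" "$1"
    shift 2
  done
}

# install_key: cache key of the dependency-install step in a given step list.
install_key() {
  chain_keys "$@" | grep 'RUN npm' | cut -d' ' -f1
}

# $1 = stand-in checksum of the manifests, $2 = stand-in checksum of the source
antipattern() {
  install_key "COPY . ." "$1+$2" "RUN npm install" "-"
}
correct() {
  install_key "COPY package.json package-lock.json ./" "$1" \
              "RUN npm ci" "-" \
              "COPY . ." "$1+$2"
}

# report LABEL KEY_BEFORE KEY_AFTER
report() {
  if [ "$2" = "$3" ]; then
    echo "$1: install step reused from cache"
  else
    echo "$1: install step re-runs"
  fi
}

# Two builds in which only the application source changes (s1 -> s2).
report "anti-pattern" "$(antipattern m1 s1)" "$(antipattern m1 s2)"
report "correct pattern" "$(correct m1 s1)" "$(correct m1 s2)"
```

In this model the source change alters the install step's key only in the anti-pattern ordering, because the changed input sits above the install step in the chain.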

Docker BuildKit: Next-Generation Building

Modern Docker uses BuildKit, a highly optimized backend builder, and it has been the default builder since Docker Engine 23.0. BuildKit enables features like parallel stage execution, secret management, and advanced cache mounts.

To pin the Dockerfile syntax version and make sure the current stable BuildKit frontend features are available, add a syntax directive as the very first line of your Dockerfile:

dockerfile

# syntax=docker/dockerfile:1

Cache Mounts

Even with the correct package.json strategy above, what happens when you do add a new dependency? npm ci has to download every single package from the internet again, because the Docker build environment doesn't have access to your host machine's ~/.npm cache.

BuildKit solves this with Cache Mounts.

dockerfile

# syntax=docker/dockerfile:1
FROM node:22-alpine
WORKDIR /app
COPY package*.json ./

# Mount the npm cache directory during the build
RUN --mount=type=cache,target=/root/.npm \
    npm ci

COPY . .
CMD ["npm", "start"]

The --mount=type=cache flag persists the /root/.npm directory between Docker builds on the same builder, and the directory's contents never become part of the image. If you add a single new dependency, npm downloads it but serves the other packages (say, 500 of them) from the cache. Build times can drop from minutes to seconds.
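
The idea behind a cache mount can be sketched without Docker. The script below is a toy model, not npm or BuildKit: a temporary directory stands in for the cache mount target, and the hypothetical helper simulated_install counts how many packages it would have to fetch. The package names are arbitrary examples.

```shell
# Toy model of a cache mount: the directory survives between "builds",
# even though each install step starts from scratch.
cache_dir=$(mktemp -d)   # stands in for the cache mount target (/root/.npm)

# simulated_install PKG...: prints how many packages had to be "downloaded"
simulated_install() {
  downloaded=0
  for pkg in "$@"; do
    if [ ! -e "$cache_dir/$pkg" ]; then
      : > "$cache_dir/$pkg"            # pretend to fetch the tarball into the cache
      downloaded=$((downloaded + 1))
    fi
  done
  echo "$downloaded"
}

# First build: the cache is empty, so everything is fetched.
first=$(simulated_install express react next)
# Second build: one dependency was added; the rest is served from the cache.
second=$(simulated_install express react next zod)

echo "first build downloaded $first packages"
echo "second build downloaded $second package(s)"

rm -rf "$cache_dir"
```

Here the second run fetches only the newly added package, which mirrors what npm ci does when the cache directory is preserved by the mount.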

Multi-Stage Builds: Shrinking Images

A Next.js application requires massive development dependencies (Webpack, Babel, SWC, TypeScript, etc.) to compile. But once it's compiled, the production server only needs Node.js and the final build output.

If you ship the development dependencies to production, your image will be 1GB+.

Multi-stage builds solve this by using multiple FROM statements in a single Dockerfile. You build the app in a bloated "builder" stage, and then copy only the compiled artifacts into a tiny "runner" stage.

Example: Multi-Stage Next.js Build

dockerfile

# syntax=docker/dockerfile:1

# ─── STAGE 1: Dependencies ──────────────────────────────────────
FROM node:22-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm npm ci

# ─── STAGE 2: Builder ───────────────────────────────────────────
FROM node:22-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# ─── STAGE 3: Production Runner ─────────────────────────────────
FROM node:22-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production

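# Copy ONLY the built artifacts and the production dependencies traced by Next.js
# (.next/standalone exists only when next.config.js sets output: "standalone")
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
# If the app has a public/ directory, copy it as well:
# COPY --from=builder /app/public ./public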

# (Running as non-root is critical, covered in Module 7)
USER node
CMD ["node", "server.js"]

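This pattern can shrink a 1.5GB Node.js image to a small fraction of that size. The exact result depends on the base image and on how much the application itself ships.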

Multi-Architecture Builds

Today, many developers write code on Apple Silicon Macs (M1/M2/M3, arm64) but deploy to Linux servers on AWS or GCP that often run amd64 (x86_64) processors.

If you build a Docker image on an M3 MacBook using standard commands, you are building a linux/arm64 image. If you deploy that exact image to an AWS EC2 instance running x86_64 processors, it will instantly crash with an exec format error.

You must build the image for the target architecture.

Using BuildKit's docker buildx, you can build for multiple architectures simultaneously:

bash

# Create a builder instance (only needed once)
docker buildx create --use

# Build for both architectures and push to a registry
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  -t myorg/myapi:latest \
  --push .

Docker builds the image for both architectures and pushes a "Manifest List" to the registry. When an EC2 instance pulls myorg/myapi:latest, its Docker client reads the manifest list and selects the amd64 version. When another developer on an M3 Mac pulls it, their client selects the arm64 version.
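
Part of the confusion comes from naming: the machine type the operating system reports differs from the platform string Docker uses. The sketch below maps one to the other and compares the host with an assumed deployment target. The function name to_docker_platform and the target value are illustrative, and only the two architectures discussed here are mapped.

```shell
# to_docker_platform MACHINE: prints the Docker platform string for a
# machine type as reported by `uname -m`. Only amd64 and arm64 are mapped.
to_docker_platform() {
  case "$1" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    *) echo "machine type not mapped in this sketch: $1" >&2; return 1 ;;
  esac
}

# Platform a plain `docker build` would target on this host.
host=$(to_docker_platform "$(uname -m)" 2>/dev/null) || host="unknown"

# Assumed deployment target, as in the EC2 example above.
target="linux/amd64"

if [ "$host" = "$target" ]; then
  echo "host and target are both $target"
else
  echo "host is $host but target is $target: build with --platform $target"
fi
```

With Docker available, docker image inspect --format '{{.Os}}/{{.Architecture}}' myorg/myapi:latest shows what a local image was built for, and docker buildx imagetools inspect myorg/myapi:latest lists the platforms in a pushed manifest list.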

Registries and Image Distribution

A production engineer must understand how images are distributed and versioned.

Registries: Docker Hub is the default, but enterprise teams often use private registries such as GitHub Container Registry (GHCR), AWS ECR, or Google Artifact Registry.
Tagging Strategy: latest is not a version. If you deploy myapp:latest, you cannot easily roll back, because latest is a mutable tag that can point to a different image after every push.
Semantic Versioning: You should tag images with exact versions (e.g., myapp:1.4.2).
Image Promotion: Do not rebuild images for staging and production. Build the image once in CI, test it in staging, and if it passes, promote that exact same immutable image artifact to production.
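
The tagging rules above can be enforced with a small guard in a deploy script. The following is a minimal sketch under stated assumptions: the function names (is_exact_version, tag_of, check_deploy_ref) are hypothetical, only plain MAJOR.MINOR.PATCH tags are accepted, and digest references (name@sha256:...) are not handled.

```shell
# is_exact_version TAG: succeeds only for MAJOR.MINOR.PATCH tags such as 1.4.2
is_exact_version() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# tag_of IMAGE_REF: prints the tag, or "latest" when the reference has none
tag_of() {
  name=${1##*/}            # drop registry host (and port) and namespace
  case "$name" in
    *:*) echo "${name##*:}" ;;
    *)   echo "latest" ;;  # Docker assumes :latest when no tag is given
  esac
}

# check_deploy_ref IMAGE_REF: returns non-zero for mutable or missing tags
check_deploy_ref() {
  tag=$(tag_of "$1")
  if is_exact_version "$tag"; then
    echo "ok: $1"
  else
    echo "refusing to deploy $1: tag '$tag' is not an exact version" >&2
    return 1
  fi
}

check_deploy_ref "myapp:1.4.2"
check_deploy_ref "myapp:latest" || echo "blocked, as intended"
```

A CI job could call check_deploy_ref before promoting an image, so that the same versioned reference that passed staging is the one sent to production.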

Key Takeaways

Optimize Layer Caching: Copy package.json and install dependencies before copying source code.
Use BuildKit: Enable cache mounts (--mount=type=cache) to dramatically speed up npm ci on subsequent builds.
Multi-Stage Builds: Strip out compilation tools and dev dependencies to shrink images from GBs to MBs.
Multi-Arch Builds: Use docker buildx so that images built on your M-series Mac also run on x86 cloud servers.
Tagging: Never deploy latest to production. Use immutable semantic versions.
