Module F-1 · 25 min read

Why containers are not VMs: Linux namespaces, cgroups, and the layered OverlayFS filesystem.

Introduction

When most developers start learning Docker, they memorize a few commands: docker run, docker stop, docker ps. They treat Docker as a black box—a magic technology that somehow runs an application on any computer exactly the same way.

Many developers assume Docker is just a lightweight Virtual Machine (VM). This mental model works when you're just starting, but it completely breaks down in production. When you need to debug a container that's silently crashing due to an Out of Memory (OOM) error, or figure out why your container has no internet access, the "lightweight VM" model is useless.

In this module, we are going to look under the hood. We will unpack the "container illusion" and look at the actual Linux primitives that make Docker possible. By the end of this module, you will understand exactly what a container is (and isn't).


The Virtual Machine vs. Container Mental Model

Let's start by addressing the most common misconception.

The Virtual Machine Model

A Virtual Machine provides hardware-level virtualization. When you run a VM, a piece of software called a Hypervisor (such as VirtualBox or VMware) creates virtualized hardware: a fake CPU, fake memory, a fake hard drive, and a fake network card.

You then install a complete Guest Operating System (like Ubuntu or Windows) onto this fake hardware. The Guest OS boots up, loads its own kernel into memory, and then runs your application.

The Container Model

Containers do not virtualize hardware. They do not run a Guest Operating System. A container is just a normal Linux process.

When you run a Docker container, you are asking the Linux kernel on your host machine to run a standard process, but you are asking the kernel to lie to that process. The kernel puts up walls around the process so it thinks it is alone on the system.

Because a container is just a process, it has very little overhead. There is no Guest OS to boot, so it typically starts in well under a second, much closer to running node server.js locally than to booting a VM.

Architecture Comparison

            VIRTUAL MACHINE                        CONTAINER
    +-----------------------------+   +-----------------------------+
    | Application 1 Application 2 |   | Application 1 Application 2 |
    | +-----------+ +-----------+ |   | +-----------+ +-----------+ |
    | | Guest OS  | | Guest OS  | |   | |           | |           | |
    | +-----------+ +-----------+ |   | |Container 1| |Container 2| |
    |         Hypervisor          |   | +-----------+ +-----------+ |
    | +-------------------------+ |   |      Container Engine       |
    | |     Host OS Kernel      | |   | +-------------------------+ |
    | +-------------------------+ |   | |     Host OS Kernel      | |
    |          Hardware           |   | +-------------------------+ |
    +-----------------------------+   |          Hardware           |
                                      +-----------------------------+

This brings us to a fundamental rule of containers:

[!IMPORTANT] Containers share the host's Linux Kernel. If you are running Docker on an Ubuntu host, all your containers are executing system calls directly against that Ubuntu kernel. You cannot run a Windows container on a Linux kernel, because a Windows application needs a Windows kernel to understand its system calls.

[!NOTE] What about macOS and Windows? If containers must share the host's Linux kernel, how does Docker Desktop work on macOS and Windows? Under the hood, Docker Desktop silently provisions a lightweight Linux virtual machine on your Mac or PC. Your containers still run on a Linux kernel—they are just running inside that hidden VM.
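
One way to check the shared-kernel rule yourself is to ask which kernel a process is running on, once from the host and once from inside a container. The snippet below is a minimal sketch using only Python's standard library; the function name kernel_summary is made up for this illustration and is not part of Docker.

```python
import platform


def kernel_summary() -> str:
    """Return the kernel name and release that the current process runs on."""
    info = platform.uname()
    return f"{info.system} {info.release}"


if __name__ == "__main__":
    # Run this once on the host and once inside a container, then compare.
    print(kernel_summary())
```

On a Linux host, the script should print the same kernel release in both places, because the container has no kernel of its own. Under Docker Desktop, the copy inside the container should report the Linux kernel of the hidden VM rather than macOS or Windows.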


The Three Pillars of the Container Illusion

If a container is just a normal Linux process, how does Docker trick it into thinking it's an isolated machine? It uses three native Linux features: Namespaces, Control Groups (cgroups), and the Union File System (UnionFS).

1. Namespaces: The Illusion of Isolation

Linux Namespaces restrict what a process can see. When Docker starts a container, it creates a dedicated set of namespaces for that process.

Linux provides several types of namespaces (others include IPC, user, and cgroup namespaces). The four you will run into most often are:

  • PID Namespace (Process ID): When you run ps aux inside a container, you only see the processes running inside that container. Your Node.js app might think it is PID 1, but on the host machine, it might actually be PID 34921.
  • NET Namespace (Networking): The container gets its own isolated network stack, complete with its own IP address, routing tables, and firewall rules.
  • MNT Namespace (Mount): The container has its own isolated filesystem mount points. It cannot see the /home directory of your host machine unless you explicitly bind-mount it.
  • UTS Namespace (UNIX Timesharing System): This allows the container to have its own hostname.
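
The "PID 1 inside, PID 34921 outside" effect can be observed from the host. On Linux 4.1 and later, the file /proc/<pid>/status contains an NSpid line listing the process ID as seen from each nested PID namespace, outermost first. The sketch below parses that line; the sample text is a hypothetical excerpt that reuses the example PIDs from the list above, and parse_nspid is an illustrative helper name.

```python
from pathlib import Path
from typing import List


def parse_nspid(status_text: str) -> List[int]:
    """Extract the NSpid values from the text of a /proc/<pid>/status file."""
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            # Fields after the label are PIDs, outermost namespace first.
            return [int(field) for field in line.split()[1:]]
    return []  # older kernels do not expose an NSpid line


# Hypothetical excerpt for a containerized Node.js process, read on the host.
sample = "Name:\tnode\nPid:\t34921\nNSpid:\t34921\t1\n"
host_pid, *inner_pids = parse_nspid(sample)
print(host_pid, inner_pids)  # prints: 34921 [1]

# On a Linux machine, the same parser works on the real file.
status_file = Path("/proc/self/status")
if status_file.exists():
    print(parse_nspid(status_file.read_text()))
```

A process that is not inside any nested PID namespace shows a single value on this line, which is one quick way to tell host processes and container processes apart.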

When you run docker exec -it my-container sh, Docker simply launches a new sh process and injects it into the existing namespaces of my-container.
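
Namespace membership is visible in /proc as well. Each entry under /proc/<pid>/ns is a symbolic link whose target looks like pid:[4026531836], and two processes are in the same namespace exactly when those identifiers match. The following sketch, which assumes a Linux host and uses illustrative helper names, reads those links and reports which namespaces two processes share. The inode numbers in the example dictionaries are invented.

```python
import os
import re
from typing import Dict, List, Optional, Tuple

NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target: str) -> Optional[Tuple[str, int]]:
    """Turn a link target such as 'pid:[4026531836]' into ('pid', 4026531836)."""
    match = NS_LINK.match(target)
    if match is None:
        return None
    return match.group("kind"), int(match.group("inode"))


def namespaces_of(pid: str = "self") -> Dict[str, int]:
    """Map each entry in /proc/<pid>/ns to its namespace identifier."""
    ns_dir = f"/proc/{pid}/ns"
    result: Dict[str, int] = {}
    if not os.path.isdir(ns_dir):
        return result  # not Linux, or the process does not exist
    for name in sorted(os.listdir(ns_dir)):
        try:
            parsed = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except OSError:
            continue  # reading another user's process may require root
        if parsed is not None:
            result[name] = parsed[1]
    return result


def shared_namespaces(a: Dict[str, int], b: Dict[str, int]) -> List[str]:
    """List the namespace types in which two processes have the same identifier."""
    return sorted(name for name in a.keys() & b.keys() if a[name] == b[name])


# Invented identifiers: a host shell compared with a container's main process.
host_shell = {"pid": 4026531836, "net": 4026531840, "mnt": 4026531841}
container = {"pid": 4026532201, "net": 4026532204, "mnt": 4026532199}
print(shared_namespaces(host_shell, container))  # prints: []
print(namespaces_of())  # namespaces of this Python process (empty off Linux)
```

Comparing the main process of my-container with a shell started through docker exec should show matching identifiers for the namespaces the shell joined, which is the "injection" described above.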

2. Control Groups (cgroups): The Illusion of Dedicated Resources

If Namespaces restrict what a process can see, cgroups restrict what a process can use.

Without cgroups, a poorly written, memory-leaking Node.js application could consume all of the server's RAM, starving every other process on the host and potentially destabilizing the whole machine.

cgroups allow you to place hard limits on hardware resources:

  • Memory: "This container can only use 512MB of RAM. If it tries to use more, kill it (OOM Kill)." (When this happens, Docker reports that the container exited with code 137, one of the most common production issues developers encounter).
  • CPU: "This container is only allowed to use 50% of CPU core #1."
  • Block I/O: "Limit the disk write speed of this container to 10MB/s."
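
The exit code 137 mentioned above follows a common convention: when a process is killed by a signal, the reported status is 128 plus the signal number, and 137 - 128 = 9 is SIGKILL, the signal the kernel's OOM killer sends. The helper below is a small sketch of that decoding on a POSIX system; describe_exit_code is an illustrative name, not a Docker API.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a container or shell exit code using the 128 + signal convention."""
    if code > 128:
        number = code - 128
        try:
            name = signal.Signals(number).name
        except ValueError:
            name = f"signal {number}"  # no named signal with this number here
        return f"killed by {name}"
    if code == 0:
        return "exited normally"
    return f"exited with error status {code}"


for code in (0, 1, 137, 143):
    print(code, "->", describe_exit_code(code))
```

Note that 137 only says the process received SIGKILL, not why. To confirm that the memory limit was the cause, check the OOMKilled field in the State section of docker inspect output.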

When you use the --memory="512m" flag in docker run, Docker is simply telling the Linux kernel to create a cgroup with a 512MB limit and place your container process inside it.
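
To make the 512MB limit concrete, the sketch below converts a Docker-style memory value into bytes and parses the limit the kernel exposes. It assumes a cgroup v2 system, where a container can usually read its own limit from /sys/fs/cgroup/memory.max (cgroup v1 hosts use a different path). Both function names are illustrative.

```python
from pathlib import Path
from typing import Optional

# Docker's b/k/m/g suffixes are powers of 1024.
_UNITS = {"b": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def docker_memory_to_bytes(value: str) -> int:
    """Convert a value such as '512m' (as passed to --memory) into bytes."""
    text = value.strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)  # a bare number is already a byte count


def parse_memory_max(raw: str) -> Optional[int]:
    """Parse the contents of a cgroup v2 memory.max file; None means unlimited."""
    raw = raw.strip()
    return None if raw == "max" else int(raw)


print(docker_memory_to_bytes("512m"))  # prints: 536870912

# Assumed path for cgroup v2; skipped quietly on systems without it.
limit_file = Path("/sys/fs/cgroup/memory.max")
try:
    print(parse_memory_max(limit_file.read_text()))
except OSError:
    print("memory.max not readable here")
```

Inside a container started with --memory="512m" on a cgroup v2 host, the second print would be expected to match the first, since the flag is ultimately just a number written into the cgroup.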

3. Union File Systems (OverlayFS): The Illusion of an OS

When you pull an image like node:22-alpine, it downloads in "layers." Why?

Containers use a Union File System (usually OverlayFS). This allows multiple distinct directories (layers) on the host machine to be stacked on top of one another to appear as a single, unified filesystem to the container.

  1. Base Image Layers: These are read-only. If you pull node:22-alpine, the base Alpine Linux files form the bottom layer, and the Node.js installation forms the next layer.
  2. Container Layer: When you start a container from an image, Docker adds a thin, writable layer on top of the read-only image layers.

If your application tries to modify a file that exists in a read-only layer, the filesystem uses a "Copy-on-Write" (CoW) strategy: OverlayFS first copies the file up into the writable container layer, and the modification is applied to that copy. The underlying read-only image layer remains unchanged.

[!TIP] This is why Docker images are so incredibly efficient. If you run 10 identical Node.js containers, they all share the exact same read-only image layers on your hard drive. Docker only creates 10 tiny writable layers for them.
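
The layering and Copy-on-Write behaviour can be modelled in a few lines. The class below is a toy simulation, not how OverlayFS is implemented: layers are plain dictionaries mapping paths to file contents, the class name ToyOverlay and the file names are invented, and file deletion is not modelled at all.

```python
from typing import Dict, List, Optional


class ToyOverlay:
    """Toy union mount: shared read-only lower layers plus one writable upper layer."""

    def __init__(self, lower_layers: List[Dict[str, str]]):
        self.lower_layers = lower_layers  # image layers, bottom first; never modified
        self.upper: Dict[str, str] = {}   # this container's private writable layer

    def read(self, path: str) -> Optional[str]:
        """Look in the writable layer first, then in image layers from top to bottom."""
        if path in self.upper:
            return self.upper[path]
        for layer in reversed(self.lower_layers):
            if path in layer:
                return layer[path]
        return None

    def append(self, path: str, text: str) -> None:
        """Modify a file: copy it up into the writable layer, then change the copy."""
        if path not in self.upper:
            self.upper[path] = self.read(path) or ""
        self.upper[path] += text


# Two read-only image layers shared by every container started from the image.
alpine_layer = {"/etc/os-release": "Alpine"}
node_layer = {"/usr/local/bin/node": "node binary", "/etc/motd": "hello"}
image = [alpine_layer, node_layer]

container_a = ToyOverlay(image)
container_b = ToyOverlay(image)
container_a.append("/etc/motd", " from a")

print(container_a.read("/etc/motd"))  # prints: hello from a
print(container_b.read("/etc/motd"))  # prints: hello
print(node_layer["/etc/motd"])        # prints: hello (image layer untouched)
```

Both containers hold a reference to the same image list, so the only per-container storage is the small upper dictionary. That mirrors the sharing described in the tip above.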


What Docker Actually Adds

A beginner might reach this point and ask: "If Linux already provides namespaces, cgroups, and OverlayFS... what exactly does Docker do?"

Docker did not invent containers. It provides a user-friendly platform built on top of existing Linux kernel features.

Linux provides the raw materials. Docker provides:

  • Image Building: A standardized way to define environments (Dockerfile).
  • Image Distribution: A way to push and pull images across registries.
  • Lifecycle Management: Simple commands to start, stop, and inspect processes.
  • Networking Abstractions: Handling the complex iptables and bridge networks so you don't have to.
  • Developer Tooling: The CLI, Compose, and BuildKit.

Key Takeaways

  1. Containers are just Linux processes. They are not Virtual Machines and do not run a Guest OS.
  2. Docker manages Linux Primitives. It relies on the Linux kernel to do the actual heavy lifting.
  3. Namespaces limit what a container can see (PID, Network, Mounts).
  4. cgroups limit what a container can use (CPU, Memory, Disk I/O).
  5. UnionFS (OverlayFS) stacks read-only image layers and tops them with a writable container layer, enabling massive storage efficiency.

Knowledge Check

Question 1: You are running a Node.js container on a Linux host. Inside the container, your Node.js process is PID 1. What happens if you SSH into the Linux host machine and run ps aux | grep node?

  • A) You will not see the Node process, because it is isolated inside the container.
  • B) You will see the Node process, but its PID will be something like 45120, not 1.
  • C) You will see the Node process, and its PID will be 1 on the host as well.
  • D) You will see a virtual machine process (like qemu) instead of the Node process.

Correct Answer: B

Because a container is just a normal Linux process, it is fully visible to the host operating system. However, because of the PID Namespace, the kernel lies to the process inside the container, telling it that it is PID 1. On the host machine (which is outside the namespace), you see the true, underlying Process ID assigned by the host kernel.

Question 2: Which Linux primitive is responsible for preventing a container from consuming 100% of the host server's RAM?

  • A) Namespaces
  • B) OverlayFS
  • C) Control Groups (cgroups)
  • D) Hypervisors

Correct Answer: C

cgroups restrict resource usage (CPU, Memory, Block I/O). Namespaces restrict visibility, OverlayFS handles the filesystem, and Hypervisors are used for Virtual Machines, not containers.

Question 3: Which statement best describes OverlayFS?

  • A) It limits CPU usage.
  • B) It isolates networking.
  • C) It stacks image layers into a unified filesystem and adds a writable container layer.
  • D) It creates virtual hardware devices.

Correct Answer: C

OverlayFS is the Union File System that Docker uses to stack read-only image layers together efficiently, adding a final writable layer on top for the running container.


© 2026 Jatin Jain Saraf (JJS). All rights reserved.