Containers & Cloud Security

Containers are the unit of deployment for most cloud workloads - and the unit of compromise for many cloud breaches. This is a vendor-neutral guide to what containers actually are, why they matter for cloud, and the real risks: escapes, identity chaining via the metadata service, flat networking, and supply chain attacks.

The 30-second version: A container is a Linux process (or set of processes) isolated by namespaces and cgroups, packaged with its dependencies into an immutable image. Multiple containers share one host kernel - so the boundary is real but thin. In the cloud, containers run on a VM whose instance metadata service (IMDS) hands out the credentials of the IAM role attached to that VM. A compromised container that can reach IMDS gets the whole VM's cloud permissions.

That's the dominant cloud-container risk: identity chaining - pivoting from "code execution in one container" to "the cloud role attached to the VM." Add container escapes via privileged flags or kernel CVEs, flat pod-to-pod networking, and pulled-from-public-internet base images, and you have the full container-security surface in the cloud.

On this page

  1. What a container actually is
  2. Why containers and cloud are inseparable
  3. The boundary - is a container a security boundary?
  4. The six risk categories that matter
  5. Container escapes - the real paths
  6. Identity chaining - IMDS and stolen cloud roles
  7. Networking - flat by default
  8. Supply chain - images, registries, signing
  9. Minimal & hardened base images
  10. Runtime detection
  11. AWS, Azure, and GCP container services
  12. Hardening checklist
  13. Common pitfalls
  14. Further reading
  15. FAQ

What a container actually is

A container is a Linux process (or group of processes) with three things wrapped around it:

  1. Namespaces - kernel features that give the process its own view of system resources. PID namespace makes the process think it's the only one on the system. Mount namespace gives it its own filesystem root. Network namespace gives it its own interfaces. UTS, IPC, user, cgroup namespaces round it out. Different containers see different namespaces; the kernel keeps the views separate.
  2. Cgroups (control groups) - kernel mechanism for capping CPU, memory, I/O, and PID counts per group of processes. With limits configured, the container can't starve the host, can't fork-bomb the system, can't allocate more memory than its limit. Without them, none of that is guaranteed.
  3. An image - a tarball containing the filesystem the process sees, plus a manifest describing how to run it. Built once, shipped to a registry, pulled to any host, run identically.

That's the entire abstraction. There is no "container kernel" - the host's kernel runs every container's code. There is no hypervisor - containers are not VMs. They start in milliseconds because they're just processes; the only thing being created is the namespace and cgroup wiring around an exec.
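To make "it's just a process with namespaces around it" concrete, here is a minimal sketch that reads the namespace links Linux exposes under /proc/<pid>/ns and compares two processes. The function names are illustrative, not from any tool; the only assumption about the system is the standard `kind:[inode]` link format on a Linux host.

```python
import os
import re

# Symlink targets under /proc/<pid>/ns look like "pid:[4026531836]".
NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target: str) -> tuple[str, int]:
    """Split a namespace link target into (kind, inode)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match["kind"], int(match["inode"])


def read_namespaces(pid: str = "self", proc_root: str = "/proc") -> dict[str, int]:
    """Return {entry name: namespace inode} for one process (Linux only)."""
    ns_dir = os.path.join(proc_root, pid, "ns")
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        result[name] = inode
    return result


def shared_namespaces(a: dict[str, int], b: dict[str, int]) -> list[str]:
    """Entries where both processes point at the same namespace inode."""
    return sorted(k for k in a.keys() & b.keys() if a[k] == b[k])
```

Calling `shared_namespaces(read_namespaces("1"), read_namespaces("self"))` from a host shell with sufficient privileges typically lists every namespace; from inside a container, with the host's /proc visible, most entries would differ. Two processes sharing an inode share that view of the system.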

Container runtimes (containerd, CRI-O, Docker Engine, runc underneath) automate the namespace setup and image management. The Open Container Initiative (OCI) standardizes the image format and runtime API so the same image runs on any conformant runtime.

Why containers and cloud are inseparable

The cloud's value proposition is "elastic capacity priced by the second." Containers are the most efficient unit to fill that capacity:

The cost of that efficiency, from a security standpoint, is a shared kernel and a shared network path through the host. Both become attack surface when the threat model is "untrusted code in a container" instead of "trusted code on a VM."

The boundary - is a container a security boundary?

It is not. Or more precisely: a container is a process-isolation boundary, not a tenant-isolation boundary. The difference matters.

If you run two of your own microservices in two containers on the same VM, the isolation is fine for most purposes - the kernel keeps their namespaces separate, cgroups keep one from starving the other. That's process isolation. It works.

If you run two different customers' code in two containers on the same VM, the isolation is not fine. A kernel CVE affects both. A misconfigured capability on one container can let it inspect or modify the other. A CPU side-channel attack (Spectre, Foreshadow) can leak data across shared CPU state. That's a tenant boundary, and the container abstraction was not designed for it.

The cloud providers know this. That's why AWS Fargate, Azure Container Apps, Cloud Run, and similar managed services run each workload inside its own sandbox: a micro-VM (Firecracker or equivalent) or a user-space kernel such as gVisor. The container API is preserved; the actual isolation comes from the hypervisor or sandbox layer, not from namespaces alone.

If you self-host on EKS / AKS / GKE / vanilla VMs, you control where containers land. If two workloads should not share a kernel - different trust levels, customer multi-tenancy, regulated data - use sandbox runtimes (gVisor, Kata Containers) or pin them to different node pools. The platform won't do this for you.

The six risk categories that matter

Container security in the cloud collapses into six recurring categories. The names differ across vendors; the threats don't.

1. Container escape

Process inside a container affects the host or other containers. Privileged flags, kernel CVEs, hostPath mounts, raw capabilities. The classic "get root on the node."

2. Identity chaining

Compromised container reaches the instance metadata service and steals the VM's IAM credentials - pivoting from one workload's permissions to the host's entire cloud role.

3. Lateral network movement

Default container networking is flat. Any pod can reach any other pod, and often the cloud control plane. One compromised container scans the internal network for soft targets.

4. Supply chain

Base images pulled from public registries, dependencies pulled from public package managers, build-time injection. Compromise the image; compromise every deploy.

5. Secrets exposure

Secrets baked into images, passed as env vars (readable from /proc, container inspection output, and often crash dumps and logs), or readable through the API server. The container becomes a secrets-extraction target.

6. Runtime visibility gap

Without runtime instrumentation, "the container is running" is all you know. Process trees, network connections, syscalls - none of it is captured by default.

Container escapes - the real paths

"Container escape" sounds exotic. In practice it's a small number of well-known paths, almost always opened by configuration choices the user made.

The privileged container

Running with --privileged (or securityContext.privileged: true in Kubernetes) disables almost every isolation feature the runtime adds. The container has all Linux capabilities, including CAP_SYS_ADMIN; sees all devices; can mount filesystems; can load kernel modules. From a privileged container, getting host root is a short step: mount the host's root disk and chroot into it. Trail of Bits has the canonical write-up.

Sensitive capabilities without --privileged

Even without the privileged flag, specific capabilities are dangerous: CAP_SYS_ADMIN, CAP_SYS_PTRACE, CAP_SYS_MODULE, CAP_NET_ADMIN, CAP_DAC_READ_SEARCH. Drop all capabilities by default and add back only what the workload needs.
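One way to check what a running container actually holds is to decode the `CapEff` bitmask from /proc/<pid>/status. The sketch below flags only the five capabilities named above, using their bit numbers from the Linux capability headers; the function names are illustrative.

```python
# Bit positions from linux/capability.h for the capabilities called out above.
DANGEROUS_CAPS = {
    "CAP_DAC_READ_SEARCH": 2,
    "CAP_NET_ADMIN": 12,
    "CAP_SYS_MODULE": 16,
    "CAP_SYS_PTRACE": 19,
    "CAP_SYS_ADMIN": 21,
}


def effective_caps_from_status(status_text: str) -> str:
    """Extract the hex CapEff mask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no CapEff line found")


def dangerous_caps(cap_mask_hex: str) -> list[str]:
    """Return the dangerous capabilities set in a hex capability mask."""
    mask = int(cap_mask_hex, 16)
    return sorted(name for name, bit in DANGEROUS_CAPS.items() if mask & (1 << bit))
```

As a usage example, the mask `00000000a80425fb`, which is commonly seen for a container started with Docker's default capability set, yields an empty list, while a mask with every bit set yields all five names. Inside a container, `dangerous_caps(effective_caps_from_status(open("/proc/self/status").read()))` gives the answer for the current process.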

HostPath / hostPID / hostNetwork / hostIPC

Mounting / (or /var/run/docker.sock) from the host into the container is direct escape. So is hostPID: true (you can see and signal host processes) and hostNetwork: true (you see and bind host interfaces, and reach IMDS exactly as the host does). hostIPC: true exposes the host's shared memory and IPC objects. Admission controllers should reject these by default in production.
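The checks an admission controller would apply to these fields can be sketched in a few lines. This is not how any particular controller is implemented; it is a plain function over a Kubernetes pod spec already parsed into a dict, using the real field names (hostPID, hostNetwork, hostIPC, volumes[].hostPath, securityContext.privileged). The function name and message strings are made up for illustration.

```python
def find_escape_risks(pod_spec: dict) -> list[str]:
    """List host-sharing settings in a pod spec that open escape paths."""
    findings = []

    # Pod-level flags that share host namespaces with the container.
    for flag in ("hostPID", "hostNetwork", "hostIPC"):
        if pod_spec.get(flag) is True:
            findings.append(f"{flag} is enabled")

    # Any hostPath volume exposes part of the node's filesystem.
    for volume in pod_spec.get("volumes", []):
        host_path = (volume.get("hostPath") or {}).get("path")
        if host_path is not None:
            findings.append(f"hostPath volume mounts {host_path}")

    # Privileged mode is set per container, including init containers.
    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    for container in containers:
        ctx = container.get("securityContext") or {}
        if ctx.get("privileged") is True:
            findings.append(f"container {container.get('name')} is privileged")

    return findings
```

An empty result means none of these specific paths is open; it says nothing about kernel CVEs or capabilities, which need their own checks.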

Kernel CVEs

Because the host kernel is shared, a kernel vulnerability is a container-escape vulnerability. Notable examples: CVE-2022-0492 (cgroups v1 release_agent), CVE-2022-0185 (filesystem context heap overflow), Leaky Vessels (runc / Docker, 2024), Dirty Pipe. The defense is keeping host kernels patched - which on serverless container services is the provider's job, and on nodes you run is yours. Managed Kubernetes sits in between: the provider publishes patched node images, but on many configurations you still have to roll them out or enable automatic upgrades.

Misconfigured docker.sock

A container that can talk to /var/run/docker.sock can ask the daemon to start a new container - privileged, with the host root mounted. Exposed Docker sockets and daemon APIs have been abused in numerous crypto-jacking campaigns. Never mount the docker socket into anything other than CI/admin tooling, and even then, isolate.

Defense in layers

Identity chaining - IMDS and stolen cloud roles

This is the cloud-specific container risk. Every cloud VM has an Instance Metadata Service at the link-local address 169.254.169.254 (or fd00:ec2::254 on AWS IPv6, similar paths on Azure and GCP). The metadata service hands out, among other things, the IAM credentials of the role attached to the VM.

By default, containers get their own network namespace, but their traffic is routed or NATed through the host, so they can usually still reach 169.254.169.254. A compromised container with code execution can run curl http://169.254.169.254/... and walk away with the VM's full cloud permissions. That's identity chaining: workload compromise → host compromise → cloud compromise, in one hop.

This is exactly the Capital One breach pattern (see the Capital One kill chain) - except instead of an SSRF in a WAF, the entry point is whatever runs in your container.

The fix has two parts

1. Per-workload identity, not per-host. Every cloud now offers a way to give an individual workload (pod, container, function) its own short-lived cloud credentials, scoped to that workload only:

With per-workload identity, the container has only the permissions it needs. A compromise gets you that workload's credentials - not the VM's.

2. Block IMDS from containers entirely. Belt-and-suspenders: even with workload identity in place, the host's IMDS shouldn't be reachable from containers.

If your container framework is "the VM has a god role and every container uses it" - that's the legacy pattern. Move every workload to per-pod / per-task / per-function identity. Block IMDS from containers as a hard rule.
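A quick way to verify the "IMDS unreachable from containers" rule is a TCP connect probe run from inside a workload. The sketch below uses only the standard library; the function name, port 80, and the one-second timeout are choices made for illustration.

```python
import socket

IMDS_ADDRESS = "169.254.169.254"  # link-local metadata address named above


def can_reach(host: str = IMDS_ADDRESS, port: int = 80, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False
```

Running `can_reach()` inside a container should return False once the block is in place. A True result only shows the endpoint accepts connections; whether credentials can actually be fetched depends on the provider's metadata protocol, so treat it as a prompt to investigate rather than proof of exposure.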

Networking - flat by default

Container networking starts permissive and gets restricted. By default in most environments:

That means a single compromised container is a scanner. It probes for accessible internal services, vulnerable management endpoints (Redis with no auth, Elasticsearch open to 0.0.0.0, a CI agent listening on the cluster network), and the cloud control plane (IMDS being the prime target).

What containment looks like

Flat network is fine for a single-team sandbox. It is not fine for production. Treat the assumption "this container only needs to talk to these specific places" as the design baseline.
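On Kubernetes, that baseline is usually expressed as a default-deny NetworkPolicy plus narrow allow rules. The sketch below builds both manifests as plain dicts using the networking.k8s.io/v1 schema; the policy names, the `app` label key, and the example CIDR and port are placeholders. Enforcement also assumes the cluster's network plugin implements NetworkPolicy.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Deny all ingress and egress for every pod in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector matches every pod
            "policyTypes": ["Ingress", "Egress"],  # no rules listed, so nothing is allowed
        },
    }


def allow_egress_policy(namespace: str, app_label: str, cidr: str, port: int) -> dict:
    """Allow pods labelled app=<app_label> to reach one CIDR on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-egress-{app_label}", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": app_label}},
            "policyTypes": ["Egress"],
            "egress": [
                {
                    "to": [{"ipBlock": {"cidr": cidr}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


def to_manifest(policy: dict) -> str:
    """Serialize a policy as JSON, which kubectl accepts alongside YAML."""
    return json.dumps(policy, indent=2)
```

For example, `to_manifest(allow_egress_policy("payments", "api", "10.0.5.0/24", 5432))` produces a manifest that lets only the `api` pods talk to a database subnet, while the default-deny policy covers everything else in the namespace.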

Supply chain - images, registries, signing

The container image is the artifact that ships to every environment. Compromising it compromises every deploy. The supply chain risks layer up:

What good looks like

The pipeline that builds the image is part of the supply chain. See the CI/CD page for how to lock that down.
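One image-side control that is easy to automate in that pipeline is rejecting references that are not pinned to a digest, since a tag can be repointed at different content. A minimal sketch, assuming sha256 digests; the function names and example registry hosts are hypothetical.

```python
import re

# A digest-pinned reference ends with "@sha256:" followed by 64 hex characters.
DIGEST_SUFFIX = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned_to_digest(image_ref: str) -> bool:
    """True if the image reference names an immutable sha256 digest."""
    return DIGEST_SUFFIX.search(image_ref) is not None


def unpinned_images(image_refs: list[str]) -> list[str]:
    """Return the references that rely on a mutable tag (or no tag at all)."""
    return [ref for ref in image_refs if not is_pinned_to_digest(ref)]
```

A CI step could collect every image reference from the manifests being deployed and fail the build if `unpinned_images` returns anything.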

Minimal & hardened base images

A standard Ubuntu or Debian base ships with hundreds of packages your application doesn't use - a shell, a package manager, network tools, build toolchains. Every one of those packages is potential CVE surface and potential attacker tooling. The biggest single lever you have to reduce both is to start from a minimal base image.

The category has grown into a small market of its own. Each vendor takes a slightly different cut at the same problem: ship the smallest, most-current, most-trustworthy base image possible, so the application is the only thing left to harden.

Distroless

Google's open-source minimal images - language runtimes only, no shell, no package manager, no busybox. The original "remove everything you don't need" base. Free, well-maintained, but rebuild cadence is community-paced and CVE coverage is best-effort.

Chainguard Images

Commercial-grade minimal images built on the Wolfi "undistribution." Daily rebuilds, signed with Sigstore, SLSA L2 provenance, SBOM attached, often zero known CVEs at any moment. Free tier covers the latest tag of each image; specific version tags and LTS are paid.

Minimus

Newer entrant focused on minimal, continuously-rebuilt images plus a remediation workflow - they ship CVE-free replacements for your existing base images and track which of your workloads still ship vulnerable bases. Commercial; the value proposition is "one click to a clean fleet."

Wiz secured images

Wiz ships hardened base images bundled with their CNAPP - minimal, signed, scanned in the same plane as the runtime workloads they observe. Differentiator is the closed loop: the image, the registry scan, and the production-runtime detection are all in one product.

RapidFort

Profiles your container's actual syscall and file usage in CI, then strips out everything the workload never touches. Less of a "use my image" play, more of a "shrink whatever image you already have" approach.

Docker Hardened Images

Docker's own commercial minimal-image line, integrated with Docker Hub and Scout. Same general shape - minimal, signed, rebuilt - sold through the registry you already pull from.

Why this category exists at all

Three real problems push organizations off "FROM ubuntu":

How minimal images help the ecosystem, not just one image

The category's real impact is upstream of any one shop:

How to actually adopt one

None of this replaces the rest of the container hardening list - non-root user, dropped capabilities, read-only root filesystem, network policy, workload identity. It removes a whole category of base-image vulnerabilities before the rest of the hardening even applies.

Runtime detection

Scanning catches what's in the image. Network policy controls what can talk. Neither tells you when a process inside a running container does something unusual - a shell spawning from a web server, an unexpected outbound connection, a credential-file read, a privilege escalation attempt.

Runtime detection closes that gap. The dominant approach is eBPF-based syscall instrumentation:

What runtime detection catches: shell spawns inside production containers, processes writing to /etc/shadow, network connections to known C2 infrastructure, capability escalation, attempts to access /proc/1/root, anomalous outbound DNS. The signal is high; tuning the rules to fit your normal workload behavior is the work.
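To show the shape of such a rule without tying it to any product, here is a toy evaluator over process events. The event schema, rule names, and path list are all hypothetical; real tools use their own rule languages and collect events from the kernel rather than from hand-built objects.

```python
import posixpath
from dataclasses import dataclass

SHELLS = {"sh", "bash", "dash", "ash", "zsh"}
SENSITIVE_PATHS = ("/etc/shadow", "/proc/1/root")


@dataclass(frozen=True)
class ProcessEvent:
    """A simplified, made-up event: one process start or file open in a container."""
    container: str
    parent: str
    executable: str
    opened_path: str = ""


def evaluate(event: ProcessEvent) -> list[str]:
    """Return the names of the toy rules this event triggers."""
    alerts = []
    # Rule 1: any shell starting inside a production container.
    if posixpath.basename(event.executable) in SHELLS:
        alerts.append("shell_spawned")
    # Rule 2: access to credential files or the host root via /proc.
    if any(event.opened_path.startswith(path) for path in SENSITIVE_PATHS):
        alerts.append("sensitive_path_access")
    return alerts
```

An event such as `ProcessEvent("web", "nginx", "/bin/sh")` triggers `shell_spawned`. The tuning work mentioned above amounts to deciding which containers and parents are allowed exceptions to rules like these.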

If you don't have runtime visibility, the answer to "did anything weird run in our containers last month?" is "we don't know."

AWS, Azure, and GCP container services

Every cloud ships an orchestrator (managed Kubernetes), a serverless container runtime, and a registry. The capabilities map closely:

| Building block | AWS | Azure | GCP |
|---|---|---|---|
| Managed Kubernetes | EKS | AKS | GKE (Standard & Autopilot) |
| Serverless containers | Fargate (with ECS or EKS), App Runner | Container Apps, Container Instances | Cloud Run |
| Simple container service | ECS (Elastic Container Service) | Container Apps | Cloud Run / Cloud Run Jobs |
| Image registry | ECR (Elastic Container Registry) | Azure Container Registry | Artifact Registry |
| Image scanning | ECR scanning (basic + enhanced via Inspector) | Defender for Containers (registry + runtime) | Artifact Registry vulnerability scanning |
| Per-workload identity | IRSA, EKS Pod Identity, ECS task roles | AKS workload identity (federated to Entra) | GKE workload identity, Cloud Run service accounts |
| Sandboxed runtime | Fargate (Firecracker micro-VM) | Container Apps (managed isolation), AKS Confidential Containers | Cloud Run (gVisor), GKE Sandbox (gVisor) |
| Admission/signing enforcement | AWS Signer + Kyverno on EKS | Image integrity policies on AKS | Binary Authorization on GKE & Cloud Run |
| Runtime threat detection | GuardDuty for EKS & ECS Runtime Monitoring | Defender for Containers | Security Command Center Container Threat Detection |

For a workload that doesn't need cluster primitives, the serverless options (Fargate, Container Apps, Cloud Run) are dramatically simpler to secure: no nodes to patch, no kernel to share, no --privileged footgun. For workloads that do need Kubernetes, see the Kubernetes page for the cluster-specific considerations.

Hardening checklist

The non-negotiable container hardening list for cloud workloads:

Image

Minimal/distroless base. Pinned to digest. No secrets baked in. Scanned at build, push, and on a schedule. Signed; signature verified at deploy.

Runtime config

Non-root user. Read-only root filesystem. All capabilities dropped. RuntimeDefault seccomp profile. No --privileged, no hostPath, no hostNetwork.

Identity

Per-workload cloud identity (IRSA / Workload Identity). IMDS unreachable from containers. No long-lived cloud credentials in env vars or images.

Network

Default-deny network policy. Explicit egress allow-list. mTLS for service-to-service. No direct public IPs on workload containers.

Layer on top: admission controllers that enforce these defaults (Pod Security Standards, Kyverno, OPA Gatekeeper); runtime detection (Falco, Tetragon, or your CNAPP's runtime module); centralized logs from every container; and an SBOM database for fast CVE response.
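The runtime-config items in the checklist map onto Kubernetes securityContext fields, so they can be checked mechanically. The sketch below inspects one container entry from a pod spec parsed into a dict. As a simplification it looks only at container-level settings, although runAsNonRoot and seccompProfile can also be set at pod level; the function name and message strings are illustrative.

```python
def missing_hardening(container: dict) -> list[str]:
    """Return the runtime-config checklist items a container spec fails."""
    ctx = container.get("securityContext") or {}
    caps = ctx.get("capabilities") or {}
    seccomp = ctx.get("seccompProfile") or {}

    # Each entry maps a problem description to whether the check passes.
    checks = {
        "runAsNonRoot is not true": ctx.get("runAsNonRoot") is True,
        "readOnlyRootFilesystem is not true": ctx.get("readOnlyRootFilesystem") is True,
        "capabilities.drop does not include ALL": "ALL" in (caps.get("drop") or []),
        "seccompProfile.type is not RuntimeDefault": seccomp.get("type") == "RuntimeDefault",
        "privileged is set": ctx.get("privileged") is not True,
    }
    return [problem for problem, passed in checks.items() if not passed]
```

A container with no securityContext at all fails the first four checks, which is a useful reminder that the defaults are permissive and the hardening has to be written down explicitly.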

Common pitfalls

Further reading

Standards & baselines

Tooling

Vendor docs

Related CSOH pages

FAQ

Are Docker, containerd, and Kubernetes the same thing?

No. Docker is a developer-facing tool (CLI + daemon) that builds and runs containers. containerd is the lower-level runtime that manages images and container lifecycles, delegating the actual process creation to runc; Docker uses it under the hood, and so does Kubernetes. Kubernetes is an orchestrator - it decides which containers run where, across many hosts. You can run containers without Kubernetes (ECS, Cloud Run, plain Docker); you can't run Kubernetes without a container runtime.

What's the difference between containers and VMs?

A VM has its own kernel; a container shares the host's. VMs boot in seconds with hundreds of MB of memory overhead; containers start in milliseconds with no kernel overhead. VMs are a strong isolation boundary (hypervisor); containers are a weaker one (kernel namespaces). The cloud's "best of both" answer is the micro-VM - Firecracker, Kata - which gives the container API on top of a fast hypervisor.

Is rootless Docker enough?

It helps a lot - the daemon no longer runs as root, and a container compromise is no longer "the daemon's root." But it's not a silver bullet: the kernel is still shared, kernel CVEs still escape, the workload still needs to drop capabilities and run as a non-root user inside the container. Rootless removes one big foot-gun; the rest of the hardening still applies.

How is "containerless" / serverless different?

Cloud Run, Container Apps, Fargate, App Runner - these run your container image, but the platform manages the host, the kernel, the runtime sandboxing, and the scaling. You don't have a node to log into. The trade-off is reduced control (no DaemonSets, no privileged sidecars) for significantly less security surface to manage. For most application workloads, this is the right default in 2026.

How does this relate to zero trust?

Containers operationalize zero trust at the workload layer when you do this right: per-workload identity (verify explicitly), default-deny networking (least privilege), mTLS between services (verify continuously), runtime detection (assume breach). The container is a unit of identity, not just a unit of deployment.
