What is the difference between Kubernetes and managed Kubernetes?

Kubernetes is the open-source orchestrator. 'Managed Kubernetes' is a cloud-hosted version (EKS on AWS, AKS on Azure, GKE on Google) where the cloud runs the control plane - etcd, API server, scheduler, controller manager - and you run only the worker nodes (or, in fully-managed modes like GKE Autopilot and EKS Auto Mode, the cloud runs those too). The Kubernetes API surface is the same; the operational responsibility moves to the cloud. The shared-responsibility split is what makes this a security topic - see the Shared Responsibility section on this page.

Is a Kubernetes cluster a security boundary?

A namespace inside a cluster is not a security boundary - it's an organizational one. RBAC and NetworkPolicy can make it more boundary-like, but pods in different namespaces still share the same kernel on whatever node they happen to land on. The cluster itself can be a security boundary if you treat the cloud account it lives in as the actual perimeter - separate cluster per environment (prod, staging, sandbox) is the standard pattern for workloads with different trust levels. Do not mix prod and untrusted workloads in one cluster.

What is workload identity in Kubernetes?

Workload identity is the mechanism by which a Kubernetes pod gets short-lived cloud credentials scoped to itself rather than inheriting the broad role attached to the worker node. AWS calls it IAM Roles for Service Accounts (IRSA) and EKS Pod Identity; Azure calls it AKS Workload Identity (federated to Entra); GCP calls it Workload Identity Federation. All three work the same way: the pod presents a signed Kubernetes service-account token to the cloud, which trusts the cluster's OIDC issuer and returns a short-lived cloud credential. This is the single most important Kubernetes-on-cloud security control - it breaks the chain from 'compromised pod' to 'node IAM role.'

Why do people say Kubernetes is hard to secure?

Because the default posture is permissive. Pods can talk to any pod by default. Service accounts get a token mounted automatically. The cluster API server has its own RBAC that can grant access to everything. Worker nodes share a kernel across all pods on them. Add multiple namespaces, third-party operators, helm charts you didn't write, and a CNI plugin you barely understand - and the configuration surface is enormous. None of this is unfixable, but the default config is the worst case and security has to be opted into deliberately at every layer.

Kubernetes & Managed Kubernetes


Last updated 2026-05-15 · By Shawn Nunley · Vendor-neutral

The 30-second version: Kubernetes orchestrates containers across a fleet of machines. Managed Kubernetes (EKS on AWS, AKS on Azure, GKE on GCP) hosts the control plane for you - you still own the workloads, the workload identity, the network policy, the RBAC, and the admission controls. The Kubernetes API surface is the same on every cloud; the configuration surface is enormous; and the default posture is permissive.

The two largest cloud-specific Kubernetes risks: identity chaining (a compromised pod stealing the worker node's cloud role via IMDS) and RBAC sprawl (some operator you installed has cluster-admin and can read every secret). Fix those first, then layer on network policy, admission control, and runtime detection.

What Kubernetes actually is
Why Kubernetes runs the cloud-native world
Shared responsibility - managed K8s edition
The Kubernetes threat model
Identity chaining - pod to node to cloud
RBAC and the API server
Pod security & escapes
Networking - flat by default
Admission control - the policy chokepoint
Cluster secrets & vault integration
Supply chain - what runs in your cluster
EKS, AKS, and GKE side-by-side
Hardening checklist
Common pitfalls
Further reading
FAQ

What Kubernetes actually is

Kubernetes is a cluster orchestrator. You declare what you want - "run three replicas of this container with these env vars and this much memory, expose port 8080, restart on failure" - and Kubernetes places that across a fleet of worker nodes, restarts containers that crash, redirects traffic when something fails, and rolls out new versions without downtime.

The architecture is split in two:

The control plane. The kube-apiserver (your front door, REST API), etcd (the cluster's source of truth - all configuration and secret material lives here), the scheduler (decides which pod goes on which node), and the controller-manager (reconciles desired state with actual state).
The data plane. Worker nodes - VMs (or bare-metal hosts) running the kubelet agent, a container runtime (containerd, CRI-O), the kube-proxy network agent, and your actual workload pods.

A pod is the unit of scheduling - one or more containers that share a network namespace and storage, scheduled together on the same node. A namespace is a logical grouping for pods, services, and policies. A service account is the pod's identity inside the cluster; combined with workload identity, it's also the pod's identity to the cloud.

None of this is cloud-specific - Kubernetes runs on bare metal, on your laptop (kind, minikube), on a Raspberry Pi cluster. But the dominant deployment shape today is managed Kubernetes in a cloud account.

Why Kubernetes runs the cloud-native world

Kubernetes hit critical mass in cloud because it solved problems the cloud's first-party services left open:

Cloud-portable workload definitions. A Helm chart or set of YAML manifests describes a workload identically on EKS, AKS, GKE, on-prem, and in a developer's kind cluster. Cloud-specific services (ECS task definitions, App Service deployment slots) don't move.
A single operational model for everything. Stateless apps, stateful databases, batch jobs, scheduled tasks, ML workloads - all use the same primitives. One operability story across a heterogeneous workload portfolio.
An enormous ecosystem. Service meshes, ingress controllers, GitOps controllers, secret managers, observability stacks - the CNCF landscape is roughly 1,500 projects, all interoperating through Kubernetes primitives.
Multi-tenant density at scale. Bin-packing many workloads onto fewer larger nodes is dramatically cheaper than one VM per service. At >50 services, the math is decisive.

The security cost: Kubernetes is also a major source of misconfiguration in cloud environments. Wiz, Snyk, and Datadog all publish State-of-Cloud reports, and Kubernetes-specific misconfigurations are a recurring theme in them. The platform is powerful precisely because its configuration surface is huge; that surface is also the attack surface.

Shared responsibility - managed K8s edition

The single most useful diagram for cloud-Kubernetes security is the responsibility split. The cloud takes more on the higher-tier managed modes; the user always owns the workload layer.

Layer	Self-managed K8s on VMs	Standard managed (EKS / AKS / GKE Standard)	Fully-managed (GKE Autopilot / EKS Auto Mode)
Control plane	You	Cloud	Cloud
etcd encryption & backup	You	Cloud (with user-managed KMS option)	Cloud
Worker node OS patching	You	You (managed node groups automate the upgrade, but you trigger it)	Cloud
Cluster networking (CNI)	You	You (cloud provides default; you can swap)	Cloud (Autopilot fixes choices)
Workload identity setup	You	You (cloud provides plumbing)	You
RBAC	You	You	You
Network policy / pod isolation	You	You	You (Autopilot enforces some defaults)
Admission control	You	You	You (Autopilot pre-installs some controls)
Pod / workload security context	You	You	You
Image security	You	You	You
Runtime threat detection	You	Cloud detection available; you install runtime agents	Cloud detection available; you install runtime agents

What this table shows: even on the most managed offering, everything above the node OS is yours. The cloud cannot configure your RBAC; it cannot decide which workloads should talk to which; it cannot decide whether a sidecar's permissions are reasonable. That is the security work of running Kubernetes in the cloud.

The Kubernetes threat model

Almost every Kubernetes-in-cloud breach narrative follows the same arc:

1. Initial foothold in a pod. Vulnerable web app, leaked credentials, supply chain attack in a base image. Any code-execution primitive will do.
2. Use the pod's service-account token to talk to the API server. Default token is auto-mounted into every pod. RBAC determines how far this gets - too often, to cluster-admin via some operator that was installed unaudited.
3. Reach the worker node's IMDS if workload identity isn't configured. Steal the node IAM role. Now you have the node's full cloud permissions - usually broad, because the node needs to pull images, mount volumes, write logs, manage load balancers.
4. Move laterally to other pods on the same node (shared kernel) or to other pods over the flat pod network. Default-allow networking makes this trivial.
5. Persist via a workload - deploy a new DaemonSet, create a backdoor service account, modify a ConfigMap. With cluster-admin, persistence is trivial.
6. Exfiltrate via the cluster's outbound network - typically unrestricted to the public internet.

Every defense on this page targets a step in that arc. Workload identity breaks step 3. Network policy breaks step 4. RBAC scoping breaks step 2 and step 5. Admission control limits what a step 1 foothold can escalate into by rejecting the worst-case primitives (privileged pods, hostPath). Egress controls break step 6. Runtime detection catches steps that slip through.

Identity chaining - pod to node to cloud

This is the cloud-Kubernetes security failure that produces the worst outcomes. Same shape as the container version (see Containers), with two added wrinkles:

The default service-account token. Every pod gets a Kubernetes service-account JWT auto-mounted at /var/run/secrets/kubernetes.io/serviceaccount/token. It's the pod's identity to the API server. With default RBAC, that token can do little beyond API discovery - but if any RoleBinding or ClusterRoleBinding has elevated the default service account, that's the attacker's stepping stone.
The node IAM role. Worker nodes need an IAM role with permissions to pull images, attach volumes, write logs, register with the cluster's load balancer. That role is reachable via the node's IMDS - from any pod running with hostNetwork: true, and from ordinary pods too when the IMDS hop limit is not enforced.

The fix is workload identity

Every cloud now offers a per-pod identity that breaks both legs:

AWS - IAM Roles for Service Accounts (IRSA) - EKS cluster runs an OIDC issuer; you annotate a Kubernetes service account with an IAM role; pods using that SA exchange the SA token for a scoped role session. EKS Pod Identity is the newer, simpler alternative.
Azure - AKS Workload Identity - federated to Entra ID, same OIDC pattern.
GCP - GKE Workload Identity - Kubernetes SAs map to Google service accounts via Workload Identity Federation.

Once workload identity is configured, also block IMDS from pods. On EKS, enforce IMDSv2 with hop-limit 1 (so containers can't reach the metadata interface through the host). On AKS and GKE, the network plugin / hostNetwork enforcement does the equivalent.

Finally - disable automatic service-account token mounting for pods that don't need to talk to the API server. automountServiceAccountToken: false on every workload that isn't an operator. This removes one of the attacker's default tools.
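
A minimal sketch of those two settings, assuming EKS with IRSA. It builds the manifests as plain Python dicts and prints JSON, which kubectl apply -f accepts alongside YAML. The annotation key eks.amazonaws.com/role-arn is the one IRSA reads; the names payments-api and payments, the account ID, and the role name are hypothetical.

```python
import json


def irsa_service_account(name, namespace, role_arn):
    """ServiceAccount annotated with the IAM role its pods should assume (IRSA)."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


def without_api_token(pod_spec):
    """Return a copy of a pod spec that opts out of the default token mount.

    Intended for workloads that never call the Kubernetes API.
    """
    hardened = dict(pod_spec)
    hardened["automountServiceAccountToken"] = False
    return hardened


if __name__ == "__main__":
    sa = irsa_service_account(
        "payments-api",  # hypothetical workload name
        "payments",      # hypothetical namespace
        "arn:aws:iam::111122223333:role/payments-api",  # hypothetical role ARN
    )
    print(json.dumps(sa, indent=2))
```

On AKS and GKE the shape is the same but the annotation keys differ; check each provider's workload identity documentation rather than reusing the EKS key.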

RBAC and the API server

Kubernetes RBAC is the equivalent of cloud IAM, scoped to the API server. Roles grant verbs (get, list, create, delete) on resource types within a namespace. ClusterRoles grant them cluster-wide. RoleBindings and ClusterRoleBindings attach roles to users, groups, or service accounts.

Common ways RBAC goes wrong:

cluster-admin to a service account. Some Helm chart for a logging agent, monitoring operator, or autoscaler ships with a default ClusterRoleBinding granting cluster-admin "just to be safe." Now any pod compromised in that namespace is cluster-admin.
Wildcard verbs. verbs: ["*"] on resources: ["*"] is functionally cluster-admin in a different shape. Audit for these.
secrets read in unexpected places. Granting get on secrets is granting access to every credential in the namespace. Service accounts almost never need this; if they do, restrict to a specific secret name.
impersonate / escalate / bind. Any verb that lets a principal change its own permissions or impersonate someone else. Hard to audit, easy to abuse. Avoid.
Aggregated ClusterRoles. Kubernetes default ClusterRoles like view, edit, admin are aggregated - they automatically pick up new permissions from CRDs that opt in. A CRD installed later can silently expand what edit means.
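
The first four failure modes above can be checked mechanically. The following is a rough sketch, not a replacement for a real RBAC auditor: it takes a Role or ClusterRole already loaded as a Python dict (for example from kubectl get clusterrole -o json) and flags wildcard rules, escalation verbs, and unrestricted secret reads. The function names are illustrative, and it does not resolve bindings or aggregated roles.

```python
ESCALATION_VERBS = {"escalate", "bind", "impersonate"}
SECRET_READ_VERBS = {"get", "list", "watch"}


def risky_rule_findings(rule):
    """Return human-readable findings for one RBAC rule dict."""
    verbs = set(rule.get("verbs", []))
    resources = set(rule.get("resources", []))
    findings = []

    if "*" in verbs and "*" in resources:
        findings.append("wildcard verbs on wildcard resources")

    escalation = verbs & ESCALATION_VERBS
    if escalation:
        findings.append("escalation verbs: " + ", ".join(sorted(escalation)))

    touches_secrets = "secrets" in resources or "*" in resources
    can_read = bool(verbs & SECRET_READ_VERBS) or "*" in verbs
    if touches_secrets and can_read and not rule.get("resourceNames"):
        findings.append("reads every secret in scope (no resourceNames)")

    return findings


def audit_role(role):
    """Map a Role/ClusterRole name to the findings across all of its rules."""
    name = role.get("metadata", {}).get("name", "<unnamed>")
    findings = []
    for rule in role.get("rules", []):
        findings.extend(risky_rule_findings(rule))
    return {name: findings}
```

A rule that names a specific secret through resourceNames passes the check, which matches the guidance above to restrict secret access to a specific secret name.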

What good looks like

One service account per workload. Workload-specific Role with the narrowest permission set that works. RoleBinding scoped to a single namespace.
Humans access via cloud SSO mapped to ClusterRoles, not via long-lived kubeconfig tokens.
Use KubiScan to find over-permissioned bindings, and kube-bench to check the cluster against the CIS benchmark.
API server audit logs flowing to your SIEM. The audit log catches "service account X created a pod with privileged: true" in a way that nothing else does.
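
As an illustration of the first item, here is a sketch that generates a dedicated service account, a Role limited to reading one named ConfigMap, and the RoleBinding that joins them. The workload, namespace, and ConfigMap names are whatever the caller passes in; the choice of "read one ConfigMap" is only an example of a narrow permission set.

```python
def workload_rbac(workload, namespace, configmap):
    """Return [ServiceAccount, Role, RoleBinding] scoped to a single namespace."""
    service_account = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": workload, "namespace": namespace},
    }
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": workload, "namespace": namespace},
        "rules": [
            {
                "apiGroups": [""],             # core API group
                "resources": ["configmaps"],
                "resourceNames": [configmap],  # one object, not the whole type
                "verbs": ["get"],
            }
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": workload, "namespace": namespace},
        "subjects": [
            {"kind": "ServiceAccount", "name": workload, "namespace": namespace}
        ],
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": workload,
        },
    }
    return [service_account, role, binding]
```

Because everything is a Role and RoleBinding rather than the Cluster-scoped variants, a compromise of this service account stays inside one namespace and one object.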

Pod security & escapes

The escape paths covered on the Containers page all apply inside Kubernetes - privileged pods, sensitive capabilities, hostPath / hostPID / hostNetwork, kernel CVEs. Kubernetes adds tooling to enforce hardening cluster-wide:

Pod Security Standards (PSS)

Kubernetes defines three Pod Security Standards levels: Privileged (no restrictions), Baseline (blocks the worst known escape paths - privileged, hostPath, hostPID, hostNetwork, hostIPC, certain capabilities), and Restricted (enforces hardened defaults: non-root user, no privilege escalation, all capabilities dropped, RuntimeDefault seccomp). Restricted does not require a read-only root filesystem; set that per pod, as described under Per-pod hardening.

Apply with namespace labels - pod-security.kubernetes.io/enforce: restricted on production namespaces; baseline on namespaces that need slightly more flexibility; never privileged in prod. PSS replaced the deprecated PodSecurityPolicy and is built into modern Kubernetes (1.25+).
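
A small sketch of what those labels look like on a Namespace manifest, again as a Python dict that can be dumped to JSON for kubectl. The label keys are the pod-security.kubernetes.io ones named above (the warn and audit modes use the same prefix); the helper name and its defaults are illustrative.

```python
PSS_LEVELS = ("privileged", "baseline", "restricted")


def namespace_with_pss(name, enforce="restricted", warn=None, audit=None):
    """Build a Namespace manifest carrying Pod Security Standards labels."""
    labels = {}
    for mode, level in (("enforce", enforce), ("warn", warn), ("audit", audit)):
        if level is None:
            continue  # mode not requested
        if level not in PSS_LEVELS:
            raise ValueError(f"unknown PSS level: {level!r}")
        labels[f"pod-security.kubernetes.io/{mode}"] = level
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "labels": labels},
    }
```

One way to tighten an existing namespace is to enforce baseline while setting warn and audit to restricted, so the stricter level is reported before it blocks anything.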

Per-pod hardening

The minimum hardened pod spec for production:

runAsNonRoot: true, runAsUser: <non-zero UID>
readOnlyRootFilesystem: true (with explicit emptyDir mounts where writes are needed)
allowPrivilegeEscalation: false
capabilities: drop: ["ALL"] with explicit adds only where needed
seccompProfile: type: RuntimeDefault
automountServiceAccountToken: false unless the workload calls the API
Resource requests and limits set (otherwise a noisy neighbor or runaway loop affects everything else)
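
Put together, the list above looks roughly like the following. This is a sketch: the UID, resource numbers, and the /tmp emptyDir are placeholder choices, and the checker only looks at the fields named in the list (it reads runAsNonRoot and seccompProfile at pod level only, although Kubernetes also allows them per container).

```python
def hardened_pod(name, image, uid=10001):
    """Pod manifest with the hardening fields from the checklist set."""
    container = {
        "name": name,
        "image": image,
        "securityContext": {
            "allowPrivilegeEscalation": False,
            "readOnlyRootFilesystem": True,
            "capabilities": {"drop": ["ALL"]},
        },
        "resources": {  # placeholder sizes
            "requests": {"cpu": "100m", "memory": "128Mi"},
            "limits": {"cpu": "500m", "memory": "256Mi"},
        },
        # Writable scratch space, since the root filesystem is read-only.
        "volumeMounts": [{"name": "tmp", "mountPath": "/tmp"}],
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "automountServiceAccountToken": False,
            "securityContext": {
                "runAsNonRoot": True,
                "runAsUser": uid,
                "seccompProfile": {"type": "RuntimeDefault"},
            },
            "containers": [container],
            "volumes": [{"name": "tmp", "emptyDir": {}}],
        },
    }


def missing_hardening(pod):
    """List checklist items that a Pod manifest does not satisfy."""
    spec = pod.get("spec", {})
    pod_sc = spec.get("securityContext", {})
    gaps = []
    if not pod_sc.get("runAsNonRoot"):
        gaps.append("runAsNonRoot")
    if pod_sc.get("seccompProfile", {}).get("type") != "RuntimeDefault":
        gaps.append("seccompProfile")
    if spec.get("automountServiceAccountToken") is not False:
        gaps.append("automountServiceAccountToken")
    for c in spec.get("containers", []):
        sc = c.get("securityContext", {})
        prefix = c.get("name", "<unnamed>") + ": "
        if sc.get("allowPrivilegeEscalation") is not False:
            gaps.append(prefix + "allowPrivilegeEscalation")
        if not sc.get("readOnlyRootFilesystem"):
            gaps.append(prefix + "readOnlyRootFilesystem")
        if "ALL" not in sc.get("capabilities", {}).get("drop", []):
            gaps.append(prefix + "capabilities.drop")
        res = c.get("resources", {})
        if not res.get("requests") or not res.get("limits"):
            gaps.append(prefix + "resources")
    return gaps
```

The same checker can be run over existing manifests in CI to find pods that drifted from the baseline.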

Sandbox runtimes for high-risk workloads

For workloads that run customer-supplied or otherwise untrusted code, the host kernel is not enough isolation. Options:

GKE Sandbox (gVisor) - per-pod user-space syscall sandbox.
AKS Confidential Containers / Kata Containers - per-pod micro-VM.
EKS on Fargate - per-pod Firecracker micro-VM; the user can't even put two of their own pods on the same kernel.

The performance cost depends on the workload (syscall-heavy and I/O-heavy code pays the most); the escape-resistance gain is significant.

Networking - flat by default

Out of the box, every pod can talk to every other pod. The cluster's pod CIDR is flat - no segmentation between namespaces, no segmentation between trust levels, no segmentation at all. This is a Kubernetes design choice (it makes service discovery simple) and a security problem (a compromised pod scans the cluster).

NetworkPolicy

Kubernetes NetworkPolicy resources define allowed ingress and egress at the pod label level. The standard pattern:

Default-deny for all pods in production namespaces (no ingress, no egress).
Explicitly allow what's needed - "frontend can ingress from ingress-controller, egress to backend"; "backend can ingress from frontend, egress to database."
Always allow egress to kube-dns in the kube-system namespace, otherwise DNS breaks.
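Never allow egress to the link-local range (169.254.0.0/16), which contains the IMDS address 169.254.169.254. NetworkPolicy has no deny rules, so in practice this means keeping that range out of every allow rule (for example with an ipBlock except entry).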

NetworkPolicy requires a CNI plugin that supports it. All three managed clouds support this (Cilium, Calico, AWS VPC CNI with restrictions, Azure CNI Overlay, GKE Dataplane V2). Verify before assuming.
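
The standard pattern can be sketched as three small builders. Assumptions to check against your own cluster: the DNS pods carry the label k8s-app: kube-dns (common, but not guaranteed), and the namespace label kubernetes.io/metadata.name is present (Kubernetes sets it automatically on recent versions). The function names are illustrative.

```python
import ipaddress

LINK_LOCAL = "169.254.0.0/16"  # contains the IMDS address 169.254.169.254


def default_deny(namespace):
    """Select every pod in the namespace and allow no ingress or egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_dns(namespace):
    """Allow egress to the cluster DNS pods in kube-system on port 53."""
    dns_peer = {
        "namespaceSelector": {
            "matchLabels": {"kubernetes.io/metadata.name": "kube-system"}
        },
        "podSelector": {"matchLabels": {"k8s-app": "kube-dns"}},  # assumed label
    }
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-dns", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [
                {
                    "to": [dns_peer],
                    "ports": [
                        {"protocol": "UDP", "port": 53},
                        {"protocol": "TCP", "port": 53},
                    ],
                }
            ],
        },
    }


def egress_peer(cidr):
    """ipBlock peer for an allow rule, carving link-local out of wide ranges."""
    net = ipaddress.ip_network(cidr)
    link_local = ipaddress.ip_network(LINK_LOCAL)
    block = {"cidr": cidr}
    if net.version == 4:
        if net.subnet_of(link_local):
            raise ValueError("refusing to allow egress into link-local range")
        if link_local.subnet_of(net):
            block["except"] = [LINK_LOCAL]
    return {"ipBlock": block}
```

egress_peer("0.0.0.0/0") yields an ipBlock with the link-local range excluded, though a narrower allow-list is preferable to a wide range with one exception.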

Service mesh

NetworkPolicy is L3/L4 - IP and port. For L7 controls (HTTP method, path, header), and for mutual TLS between every workload, a service mesh adds an identity-aware proxy alongside every pod. Options:

Istio - comprehensive, opinionated, large operational footprint.
Linkerd - minimal, fast, Rust-based data plane, lower operational cost.
Cilium - eBPF-based, can do mesh and CNI in one project.
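Cloud-provider offerings: AWS App Mesh (AWS has announced its end of support, so check the current status before adopting it), Cloud Service Mesh on GKE (formerly Anthos Service Mesh), and the Istio-based service mesh add-on on AKS.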

For a small cluster, NetworkPolicy is usually enough. For multi-team production with workloads at different trust levels, the mesh's mTLS and identity-based authorization are the right primitives.

Ingress / egress controls

Ingress. Public traffic enters via an ingress controller (NGINX, Traefik, Cilium, cloud-native ALB/AppGW/GCLB). The controller terminates TLS, enforces rate limits, and is the only thing on the public network.
Egress. Restrict outbound to an allow-list - your registry, your package managers, your APIs. NAT gateways with allowed-destination rules, mesh egress gateways, or Cilium's L7 egress policies all work.

Admission control - the policy chokepoint

Every change to the cluster flows through the API server. Admission controllers intercept those requests and can mutate or reject them. This is the chokepoint where you enforce "no privileged pods," "no images without signatures," "every namespace must have a NetworkPolicy" - at the place the cluster cannot bypass.

Built-in admission

Kubernetes ships several admission controllers; on managed clouds, the relevant ones are typically enabled by default. Pod Security Standards (covered above) is the most important built-in admission control.

Policy engines

For richer rules - beyond what PSS covers - install a policy engine:

Kyverno - Kubernetes-native (policies are Kubernetes resources, no separate language). Easier to adopt; the dominant choice in 2026.
OPA Gatekeeper - OPA + Rego. More expressive, steeper learning curve.

Policies worth running on day one

Reject pods with privileged: true, hostPID, hostNetwork, hostIPC, or hostPath mounts.
Reject pods that pull images from unapproved registries.
Require runAsNonRoot and readOnlyRootFilesystem.
Require resource requests and limits.
Require every namespace to have a default-deny NetworkPolicy.
Reject ClusterRoleBindings that grant cluster-admin to service accounts.
Require signed images (Binary Authorization on GKE; Kyverno + Cosign elsewhere).

Audit before enforce

Roll new policies out in audit mode first - log violations without blocking. Once the log is clean (or fixes have shipped), flip to enforce. This pattern avoids the "we deployed a policy and now half the cluster won't start" outage.
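
The audit-then-enforce flow is easy to see in miniature. The sketch below is not Kyverno or Gatekeeper syntax; it is a plain Python illustration of the decision logic, covering a few of the day-one policies above. The registry prefix registry.example.com/ is a hypothetical allow-list entry.

```python
ALLOWED_REGISTRIES = ("registry.example.com/",)  # hypothetical allow-list


def violations(pod_spec):
    """Return policy violations for a pod spec dict."""
    found = []
    for flag in ("hostPID", "hostNetwork", "hostIPC"):
        if pod_spec.get(flag):
            found.append(f"{flag} is set")
    for volume in pod_spec.get("volumes", []):
        if "hostPath" in volume:
            found.append(f"hostPath volume {volume.get('name')!r}")
    for c in pod_spec.get("containers", []):
        name = c.get("name", "<unnamed>")
        if c.get("securityContext", {}).get("privileged"):
            found.append(f"container {name!r} is privileged")
        if not c.get("image", "").startswith(ALLOWED_REGISTRIES):
            found.append(f"container {name!r} uses an unapproved registry")
    return found


def admit(pod_spec, mode="audit"):
    """Return (allowed, violations).

    In audit mode violations are reported but the request is allowed;
    in enforce mode any violation rejects the request.
    """
    if mode not in ("audit", "enforce"):
        raise ValueError(f"unknown mode: {mode!r}")
    found = violations(pod_spec)
    allowed = mode == "audit" or not found
    return allowed, found
```

The same pod spec produces the same violation list in both modes; only the allowed flag changes, which is what makes the audit log a reliable preview of what enforcement will block.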

Cluster secrets & vault integration

A kind: Secret in Kubernetes stores its values base64-encoded, not encrypted. kubectl get secret foo -o jsonpath='{.data.password}' | base64 -d recovers the plaintext of one key (here a hypothetical key named password), and the same applies to anyone reading etcd directly. Production clusters need (1) etcd-level encryption-at-rest, and (2) an external source of truth for the secret values so rotation, audit, and lifecycle live in a real secrets manager - not in git and not in kubectl. The patterns below are typically combined, not picked individually. For the crypto and capability details behind the secrets managers themselves, see Data Security & KMS - Secrets management; for the broader workflow story, see IAM - Secrets in developer workflows.
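
To make the "encoded, not encrypted" point concrete, here is a short sketch that builds a Secret manifest the way the API represents it and then reverses it with nothing but the standard library. The secret name, key, and value are made up for the example.

```python
import base64


def make_secret(name, values):
    """Build a Secret manifest; each value is base64-encoded, as the API expects."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "data": {
            key: base64.b64encode(value.encode()).decode()
            for key, value in values.items()
        },
    }


def read_secret(secret):
    """Recover plaintext from a Secret manifest. No key material is needed."""
    return {
        key: base64.b64decode(value).decode()
        for key, value in secret.get("data", {}).items()
    }


if __name__ == "__main__":
    secret = make_secret("foo", {"password": "not-actually-protected"})
    print(secret["data"]["password"])  # base64 text
    print(read_secret(secret))         # plaintext again
```

Anyone who can read the object, through the API or from an etcd snapshot, can do the same thing.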

1. Encrypt etcd at rest with a KMS provider

The minimum bar. Configure the API server's KMS encryption provider so Secret resources are encrypted with a customer-managed key before being written to etcd. Managed offerings make this one toggle:

EKS - Envelope encryption with AWS KMS, enabled at cluster creation (or via update). Pin the KMS key in IAM so only the cluster's control-plane role can use it.
AKS - KMS etcd encryption with Azure Key Vault; the cluster authenticates to Key Vault via managed identity.
GKE - Application-layer Secrets Encryption with a Cloud KMS key in the same project.

This protects against etcd-snapshot exfiltration, backup theft, and direct etcd reads. It does not protect against an attacker with API access - the API server will decrypt and serve. KMS encryption is necessary but not sufficient.

2. External Secrets Operator (ESO)

External Secrets Operator is the de-facto standard. You define an ExternalSecret CRD that references a value in an external store (AWS Secrets Manager, Azure Key Vault, GCP Secret Manager, HashiCorp Vault, 1Password, Doppler, Infisical, Akeyless, GitHub, and many more), and ESO materializes it as a Kubernetes Secret, refreshing on a schedule. The cluster doesn't own the secret - the secrets manager does, and ESO is the synchronizer.

Strengths - works with the existing app/Helm chart pattern (consume a Kubernetes Secret as env/volume), broad provider support, no app-code changes, refreshes give rotation for free.
Tradeoff - the secret does land in etcd (encrypted via the KMS provider, ideally) and is reachable via the K8s API for anyone with RBAC.
Auth pattern - the ESO controller's service account uses IRSA / Workload Identity / Azure Workload Identity to authenticate to the external store. No static secret to start the chain.
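
A sketch of an ExternalSecret manifest, built as a Python dict. The field names (refreshInterval, secretStoreRef, target, data, remoteRef) follow the ESO API; the apiVersion shown is an assumption that depends on the ESO release installed in your cluster, and the store name, remote key, and secret names are hypothetical.

```python
def external_secret(name, namespace, store, remote_key, secret_key, refresh="1h"):
    """ExternalSecret that syncs one remote value into a Kubernetes Secret."""
    return {
        # Assumed API version; newer ESO releases also serve a v1 API.
        "apiVersion": "external-secrets.io/v1beta1",
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "refreshInterval": refresh,  # how often ESO re-reads the store
            "secretStoreRef": {"name": store, "kind": "ClusterSecretStore"},
            "target": {"name": name},    # name of the Secret ESO creates
            "data": [
                {
                    "secretKey": secret_key,          # key inside the Secret
                    "remoteRef": {"key": remote_key},  # path in the external store
                }
            ],
        },
    }
```

The application keeps consuming an ordinary Secret; the only object checked into git is this reference, which contains no secret material.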

3. Secrets Store CSI Driver

Secrets Store CSI Driver mounts secrets directly from the external store as a tmpfs volume in the pod - they never become K8s Secret objects, never hit etcd. Provider plugins exist for AWS, Azure, GCP, and Vault.

Strengths - lowest attack surface (no etcd footprint, no K8s API path), pod-scoped access enforced at mount time.
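Tradeoff - apps consume secrets as files instead of env vars (some retrofit needed); picking up rotated values requires the driver's optional rotation feature or a pod restart. The optional sync-to-Kubernetes-Secret feature brings env-var consumption back, but it also puts the value back in etcd.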

4. Vault Agent Injector

Vault Agent Injector is a mutating webhook that adds a Vault Agent sidecar (or init container) to annotated pods. The agent authenticates to Vault using the pod's service-account JWT, fetches secrets, and writes them to a shared volume (or templated config file). Right answer when the team already runs Vault and wants dynamic secrets (DB credentials minted per-pod, expiring with the pod).

5. SOPS for GitOps

SOPS encrypts the values inside YAML / JSON / env / ini files using a cloud KMS key (or age / PGP), leaving the keys readable so diffs make sense. The encrypted file is safe to commit to git. Flux and Argo CD both have first-class SOPS integration: the reconciler decrypts at apply-time using its workload identity. Right answer for GitOps-first teams that want declarative cluster state including secrets, with the source of truth in git.

6. Sealed Secrets

Bitnami Sealed Secrets is the older "encrypt-to-the-cluster" pattern - a controller in-cluster holds a private key, anyone can encrypt against the matching public key, only that controller can decrypt. Simple to operate, but the keypair is cluster-bound (multi-cluster requires re-sealing) and the controller is a single point of failure for decryption.

7. SPIFFE / SPIRE for cross-environment workload identity

SPIFFE / SPIRE issues short-lived X.509 or JWT identities (SVIDs) to workloads based on attestation (which node, which pod, which service account). The cross-environment story is the point: a workload in Kubernetes, a workload on a VM, a workload on-prem can all hold a SPIFFE identity that downstream services trust uniformly. Right answer for multi-cluster, multi-cloud, or hybrid-on-prem deployments where you want one identity fabric instead of one per platform.

Picking among them

Production cluster, secrets in a cloud secrets manager: KMS etcd encryption plus External Secrets Operator or Secrets Store CSI Driver. Pick CSI if you can refactor apps to read from files; ESO otherwise.
GitOps with secrets in repo: SOPS with Flux/Argo (multi-cluster friendly) or Sealed Secrets (single-cluster simpler).
Vault shop wanting dynamic secrets: Vault Agent Injector or CSI driver with the Vault provider.
Multi-cluster / hybrid identity: SPIRE underneath any of the above.
Any of the above without etcd encryption enabled: incomplete. Turn on KMS encryption first.

What about `kubectl create secret`?

Fine for ephemeral demos and bootstrap. Not fine as the source of truth for anything in production. The path to "we don't know what's deployed" starts with a human running kubectl create secret at 2am to fix an outage; the secret is now nowhere except etcd, and nobody knows it exists six months later when the rotation is overdue. Every production secret should originate in a secrets manager and arrive in the cluster via ESO / CSI / Vault Agent.

For pipeline-side patterns (how the secret gets into the manager in the first place from CI), see CI/CD - Secrets in pipelines.

Supply chain - what runs in your cluster

The cluster runs more than your code. Every Helm chart you install, every operator, every CRD, every add-on (logging, metrics, ingress, secrets, GitOps) is third-party code with cluster permissions. The supply chain extends beyond your own image build.

Audit operators. Each operator installs CRDs, RBAC, and pods. Read the RBAC manifests before helm install. If the chart grants cluster-admin to a service account, ask why before agreeing.
Sign and verify images. Covered on the Containers page. In Kubernetes, the verification happens at admission: Binary Authorization on GKE, Kyverno + Cosign on EKS/AKS.
Pin to digests, not tags. Helm charts that use floating tags re-pull a different image silently. Pin image: registry/repo@sha256:abc... in your overrides.
Maintain an SBOM database. For both your images and the third-party ones running in your cluster. When CVE-2024-X drops, the question "is anything vulnerable" is one query.
Scan workloads continuously. Not just at build - periodically rescan everything running in the cluster against the latest CVE feed.
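
Digest pinning is simple to check automatically. The following sketch scans a pod spec dict for image references that do not end in a sha256 digest; the function name is illustrative, and it checks only the reference format, not whether the digest exists or is signed.

```python
import re

# An image reference pinned by digest ends with @sha256:<64 hex characters>.
DIGEST_PINNED = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_images(pod_spec):
    """Return image references in a pod spec that rely on a tag (or nothing)."""
    images = [
        c.get("image", "")
        for key in ("initContainers", "containers")
        for c in pod_spec.get(key, [])
    ]
    return [image for image in images if not DIGEST_PINNED.search(image)]
```

Run against rendered Helm output (helm template piped through a YAML or JSON loader), this catches charts whose defaults still use floating tags.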


EKS, AKS, and GKE side-by-side

All three managed Kubernetes services run upstream Kubernetes - same API, same primitives, same kubectl. The differences are in the managed scope, the cloud integrations, and the operational defaults.

Building block	EKS	AKS	GKE
Control plane management	Cloud (HA across 3 AZ)	Cloud (free tier; paid for SLA)	Cloud (regional or zonal)
Fully-managed mode	EKS Auto Mode, Fargate profiles	Container Apps (separate product)	GKE Autopilot
Workload identity	IRSA, EKS Pod Identity	AKS Workload Identity (Entra federated)	GKE Workload Identity
Default CNI	AWS VPC CNI (pod = VPC IP)	Azure CNI Overlay or kubenet	GKE Dataplane V2 (Cilium-based) or routes
NetworkPolicy support	VPC CNI w/ policy or Cilium	Calico, Cilium, Azure NPM	Built-in via Dataplane V2
Image admission	Kyverno / Gatekeeper + AWS Signer	Image integrity policies, Defender for Containers	Binary Authorization (native)
Runtime threat detection	GuardDuty EKS Protection & Runtime Monitoring	Defender for Containers	SCC Container Threat Detection
Audit log destination	CloudWatch Logs	Azure Monitor / Log Analytics	Cloud Audit Logs
Private control plane	Private endpoint option	Private cluster option	Private cluster option
Auto-upgrade	Managed node groups with maintenance windows	Auto-upgrade channel	Release channels (rapid / regular / stable)

For teams new to Kubernetes, fully-managed modes (GKE Autopilot, EKS Auto Mode + Fargate, AKS + Container Apps for workloads that fit) remove enormous classes of operational and security work - no node patching, no privileged-pod admission decisions to make, no CNI choice to second-guess. The trade-off is reduced flexibility; for most application workloads, that's a feature.

Hardening checklist

The non-negotiable Kubernetes-in-cloud hardening list:

Identity

Workload identity (IRSA / WIF) for every pod that talks to the cloud. IMDS blocked from pods. Default SA token mounting disabled. Cloud SSO for human cluster access.

RBAC

One service account per workload. Narrowest possible Role; no wildcards. ClusterRoleBindings audited. API server audit logs flowing to SIEM.

Pod & node

Pod Security Standards restricted on prod namespaces. Sandbox runtime (gVisor / Kata / Fargate) for untrusted workloads. Nodes auto-patched on a defined cadence.

Network & admission

Default-deny NetworkPolicy. Egress allow-list. Admission control (Kyverno) rejecting risky configs. Signed-image enforcement at admission.

Add: runtime detection (Falco, Tetragon, your CNAPP runtime module); centralized container and audit logs; image scanning at build and registry; an SBOM database for fast CVE response; cluster autoscaler limits to cap blast radius from runaway pods.

Common pitfalls

One giant cluster for everything. Prod, staging, sandbox, untrusted dev workloads all on one cluster. A compromise crosses every trust boundary. Multi-cluster (or at least one cluster per environment) is the standard pattern.
cluster-admin granted to operators. The fastest path to an over-permissioned cluster. Audit every chart's RBAC before installing.
No workload identity. Pods inherit the node's broad cloud role. Capital-One-style identity chaining is one app exploit away.
Pod Security Standards not enforced. "We mean to" doesn't count. PSS is two namespace labels; apply them.
No NetworkPolicy. Flat networking = lateral movement on day one.
Admission control in audit-only forever. If violations are never blocked, the policy is documentation, not control.
Treating namespaces as security boundaries. They're not. Use clusters for hard boundaries; namespaces for organization.
Skipping audit log analysis. The Kubernetes audit log answers most "what happened" questions if it's flowing somewhere. Wire it to your SIEM on day one.
Letting the cluster diverge from IaC. Hot-fixes made with kubectl edit are silent drift. Run all changes through GitOps (Argo CD, Flux) so the cluster's state is auditable.
Ignoring node-level concerns on "fully managed." Even on Autopilot, you own the workload security context. Autopilot rejects the worst cases, privileged pods among them, but it will not stop a container from running as root with a writable filesystem and an over-broad service account.

FAQ

Do I need Kubernetes at all?

Often, no. If you have one or two workloads, a serverless container service (Cloud Run, Fargate, Container Apps) gives you containerized deployment without operating a cluster. The decision point is around 5-10 distinct workloads with shared infrastructure needs (service-to-service identity, scheduled jobs, multi-region rollout). Below that, Kubernetes is overhead; above it, the operational model pays back.

EKS vs AKS vs GKE - which is "most secure"?

The Kubernetes API surface is the same on all three. GKE has historically led on built-in defaults - Workload Identity, Binary Authorization, GKE Sandbox, Autopilot - but EKS and AKS have closed the gap and offer equivalents. The bigger driver is "which cloud do you already live in?" - running EKS from an Azure organization or GKE from AWS adds federation complexity you don't need.

Should I use Helm?

For deploying off-the-shelf software (a database operator, an ingress controller), Helm is the lingua franca and you'll use it whether you like it or not. For your own workloads, Helm or Kustomize or raw YAML are all defensible. The security-relevant choice is "are these manifests in git, reviewed, and deployed by a pipeline" - not which tool generated them.

What about GitOps?

GitOps controllers (Argo CD, Flux) pull desired state from a Git repo and reconcile it into the cluster. Two big security wins: the cluster never trusts the CI runner (only itself and the Git repo), and every cluster change is in git history. For production workloads, GitOps is the right deploy model in 2026. See the CI/CD page for how it sits next to the build pipeline.

Are CRDs (custom resources) a security risk?

The CRD itself is data - it doesn't run code. But CRDs are usually paired with operators (controllers) that do run code, often with broad cluster permissions. Auditing CRD-bearing operators for RBAC scope and image provenance is the same exercise as auditing any third-party software with cluster-level access.

How does this relate to zero trust?

Kubernetes plus a service mesh is one of the cleanest places to put zero-trust principles into production. mTLS between every workload (verify explicitly), per-pod workload identity (least privilege), default-deny network policy (assume breach), continuous runtime visibility (verify continuously). The cluster gives you the primitives; the configuration is the work.

Where next

Containers & cloud security - what runs inside the pods.
CI/CD for cloud deployments - the pipeline that ships into the cluster.
Landing zones - the foundation the cluster lives in.
CSPM vs CNAPP - the tools that watch clusters in production.
Friday Zoom - Kubernetes war stories are a regular topic. Drop in.