The 30-second version: In cloud, identity is the perimeter. Every action is an authenticated API call, every misconfigured trust is a foothold, every overprivileged role is a blast radius. The work of cloud IAM is to (1) federate humans through one identity provider so credentials are short-lived and revocable, (2) eliminate long-lived workload secrets by adopting platform workload identity (IRSA, GKE Workload Identity, Azure Workload Identity, OIDC trust for CI/CD), (3) continuously narrow permissions using the cloud's own analyzers plus a third-party tool when the volume demands it, and (4) detect identity abuse with high-signal alerts that fire on the post-compromise behaviors attackers exhibit.
The trap is treating IAM as a one-time configuration. It's a discipline - drift, role proliferation, exception accumulation, and shadow identities (service accounts created in side projects, OAuth apps consented to, federated trusts added to integrate a new SaaS) all bend the curve away from least privilege every week. The mature program treats identity like a code base: reviewed, versioned, tested, and monitored.
On this page
- What IAM is in cloud
- Identity types
- Federation & SSO
- RBAC vs ABAC vs ReBAC
- Least privilege & permissions analyzers
- JIT access & PAM
- Workload identity (no more keys)
- Non-human identities (NHI)
- Secrets in developer workflows
- Privilege escalation paths
- MFA & phishing-resistant auth
- Detection signals for identity abuse
- AWS, Azure, and GCP side-by-side
- Maturity stages
- Common pitfalls
- Further reading
- FAQ
- Where next
What IAM is in cloud
IAM - Identity and Access Management - is the set of systems that answer two questions on every cloud API call: who is this, and are they allowed to do this. The first is authentication (AuthN); the second is authorization (AuthZ). In cloud, both happen at the control plane on every action, not once at the front door, which is the central difference from the on-prem AD-centric model many practitioners came up in.
The on-prem model is roughly: a user logs into a workstation, the workstation gets a Kerberos ticket from a domain controller, the ticket grants implicit trust to most internal resources behind the same firewall. Network position is most of the access decision; identity is a constant inside the perimeter. The cloud model has no perimeter - every call to the AWS, Azure, or GCP API is a fresh authorization decision evaluated against an explicit policy. There is no "behind the firewall"; an IMDS-exposed role abused from a compromised pod talks to the same control plane as the security team's own console session.
The practical implications: every workload needs an identity (no shared service accounts), every policy is auditable (every Allow / Deny is a log line), every change is an event (a new role assumption is detectable in real time), and every credential is interesting to an attacker (long-lived keys leak; short-lived tokens are the only safe pattern). The IAM team's job is to make that surface area legible, narrow, and observable.
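To make the "fresh authorization decision on every call" idea concrete, here is a minimal sketch of a policy evaluator. It is a toy, not any cloud provider's engine: the statement shape loosely mirrors AWS-style policies, the bucket names are hypothetical, and real evaluation also handles conditions, principals, boundaries, and case-insensitive action matching.

```python
from fnmatch import fnmatchcase


def is_allowed(statements, action, resource):
    """Toy evaluator: explicit Deny wins, then any Allow, else implicit deny."""
    allowed = False
    for stmt in statements:
        action_match = any(fnmatchcase(action, p) for p in stmt["Action"])
        resource_match = any(fnmatchcase(resource, p) for p in stmt["Resource"])
        if not (action_match and resource_match):
            continue
        if stmt["Effect"] == "Deny":
            return False  # an explicit deny overrides every allow
        allowed = True
    return allowed  # stays False when nothing matched (implicit deny)


# Hypothetical policy: read access to a log bucket, except one prefix.
policy = [
    {"Effect": "Allow", "Action": ["s3:Get*"], "Resource": ["arn:aws:s3:::app-logs/*"]},
    {"Effect": "Deny", "Action": ["s3:*"], "Resource": ["arn:aws:s3:::app-logs/secret/*"]},
]
```

Every call runs through the same function with no notion of network position; the only inputs are the caller's policy, the action, and the resource.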
Identity types
The first taxonomy to internalize: cloud IAM serves three distinct populations, with different lifecycles, threat models, and tooling.
Human identities
- Workforce - employees, contractors, vendor partners. Lives in your corporate IdP (Okta, Entra ID, Google Workspace). Federated into every cloud, SaaS, and internal app.
- Customer (CIAM) - end users of your product. Auth0, Cognito, Entra External ID, Firebase Auth, Stytch, WorkOS. Different scale (millions of users, social login, account-recovery flows) and different threat model (credential stuffing, ATO).
- B2B - your customer's IT admins integrating with your product via SSO. SAML / OIDC inbound, SCIM provisioning, the messy world of "every customer wants a different IdP."
Workload identities
Anything that isn't a human but needs to call an API: a pod in Kubernetes, a serverless function, a CI/CD job, an on-prem agent, a third-party SaaS connector, a script running in someone's automation. Historically these used long-lived access keys; the modern pattern is short-lived tokens minted from a platform identity (IRSA, GKE Workload Identity, Azure Workload Identity) or from a federated OIDC trust. The volume here typically exceeds the human population 10-100×, and most orgs underestimate their workload-identity count by an order of magnitude.
Service / principal identities
The cloud-native IAM constructs - AWS IAM users / roles / federated principals, Azure service principals / managed identities, GCP service accounts / workload-identity pools. These are the shape identity takes inside the cloud, regardless of whether the entity behind them is human, workload, or another service. The principal is what policies attach to and what logs reference. Most identity work in cloud is really principal hygiene: who owns this role, what touches it, when was it last used, is it still needed.
The three populations overlap in unintuitive ways. A workforce engineer logs into AWS via SSO (human → principal), pushes code via a CI/CD job that assumes a deploy role (workload → principal), and the deployed service runs as a pod-bound role that calls S3 (workload → principal). Three different identity types, one continuous chain of accountability if the logs and trust relationships are wired correctly. The most common audit gap is that the chain isn't continuous - a CI job runs as a generic shared role, so "who deployed this" can't be answered. Fix the federation, fix the audit.
Federation & SSO
Federation is the pattern that lets a single identity provider authoritatively assert "this is user X" to multiple downstream systems. SSO is the user-visible benefit: one login, many apps. The protocols and the IdP landscape:
SAML 2.0
The grandfather of enterprise federation. XML-based, browser-redirect flow, signed assertions. Still dominant for SaaS-to-SaaS federation in B2B contexts because every IdP and every enterprise app speaks it. Verbose, awkward to debug, vulnerable to subtle bugs in signature validation if you ever roll your own - which you shouldn't.
OIDC (OpenID Connect)
The modern, JSON / JWT-based federation protocol built on OAuth 2.0. Cleaner, mobile-friendly, easier to debug, and - critically for cloud - the protocol cloud providers trust for federating workloads from CI/CD systems, Kubernetes clusters, and on-prem agents. If you're integrating a new system in 2026, prefer OIDC; reserve SAML for legacy partners that don't speak OIDC.
SCIM (System for Cross-domain Identity Management)
The provisioning protocol that complements SAML / OIDC. SAML / OIDC authenticate sessions; SCIM keeps user lifecycles (create / update / deprovision) in sync between the IdP and downstream apps. Without SCIM, you authenticate a user successfully - and the destination app has no idea they exist, or worse, doesn't know they were terminated last quarter. The off-boarding gap is the most expensive consequence of skipping SCIM.
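As an illustration of the deprovisioning half of that lifecycle, the sketch below builds (but does not send) the kind of SCIM 2.0 PATCH request an IdP might issue to mark a terminated user inactive. The PatchOp schema URN and media type come from the SCIM 2.0 protocol; the function name, the returned dict shape, and the user ID are hypothetical, and individual apps differ in which attributes they honor.

```python
import json


def scim_deactivate_request(user_id):
    """Describe a SCIM 2.0 PATCH that sets a user's 'active' attribute to false."""
    body = {
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [{"op": "replace", "path": "active", "value": False}],
    }
    return {
        "method": "PATCH",
        "path": f"/Users/{user_id}",
        "headers": {"Content-Type": "application/scim+json"},
        "body": json.dumps(body),
    }


# Hypothetical user ID, for illustration only.
request = scim_deactivate_request("2819c223-example")
```

The point is that off-boarding becomes an event the IdP pushes, rather than a ticket someone has to remember to file per app.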
The IdP landscape
- Okta - the SaaS IdP market leader, especially for B2B SaaS and mid-market. Strong app catalog, mature SCIM connectors. Okta Workforce Identity Cloud for employees; Auth0 (Okta-owned) for CIAM.
- Microsoft Entra ID (formerly Azure AD) - dominant in Microsoft-centric enterprises and basically default in any org running Microsoft 365. Entra ID Governance, PIM, Conditional Access. Entra External ID is the Microsoft CIAM offering.
- Google Workspace - IdP for orgs that run on Google Workspace. Cloud Identity is the stand-alone IAM-only SKU. Strong integration with GCP Workload Identity Federation.
- Ping Identity - enterprise-grade, strong in financial services and regulated industries. PingOne, PingFederate, PingAccess.
- JumpCloud - directory + IdP + device management, popular with SMBs and growth-stage companies that want one tool for everything.
- Auth0 - developer-friendly CIAM, owned by Okta. The default choice for many product engineers building customer authentication.
- Open source - Keycloak, Ory, authentik, Zitadel. Worth considering when you want to own the identity layer or when CIAM cost-per-MAU at scale becomes painful.
One IdP, one source of truth
The architectural decision that pays the most dividends: pick one workforce IdP, federate every cloud and every SaaS through it, and treat that IdP's user lifecycle (joiner / mover / leaver via HRIS-driven SCIM) as the authoritative event stream for access. Multi-IdP environments - usually the result of acquisitions or org-chart politics - produce orphan accounts, missed deprovisions, and security exposure. Consolidate, even if it takes 18 months.
RBAC vs ABAC vs ReBAC
The three authorization models you'll encounter, and where each fits in a cloud environment.
| Model | How it grants access | Best for | Watch out for |
|---|---|---|---|
| RBAC | Permissions attach to named roles; subjects are assigned roles. | Cloud control-plane access, small-to-medium app permissions, anything an auditor needs to read at a glance. | Role explosion at scale (1 role per team × per env × per privilege level = thousands of roles). |
| ABAC | Policies evaluate attributes of subject, resource, action, and context - tags, claims, time, IP, MFA state. | Scale-out tenant data planes, multi-team cloud accounts using tags to scope access, per-environment guardrails. | Harder to reason about; one wrong tag flips the access decision. Strong tagging discipline is a prerequisite. |
| ReBAC | Access is a graph relationship - user is editor of doc, doc is in folder, folder is owned by team. | SaaS sharing models (Google Docs / Notion / Figma-style), hierarchical resources, fine-grained app permissions. | Newer pattern; tooling (OpenFGA, SpiceDB, Permify, Cerbos) is maturing but less embedded in cloud-native IAM than RBAC / ABAC. |
In practice, cloud-native IAM is RBAC at the surface with ABAC primitives layered in. You define roles (RBAC), and within the role's policy you use condition keys / aws:ResourceTag / aud / claims / scope to add attribute-based constraints (ABAC). The pure-RBAC org with thousands of roles is at one failure mode; the pure-ABAC org with one role and 200 condition keys is at the other. Most mature programs land in the middle: a small set of roles per persona, ABAC conditions inside each role to scope to the right resources.
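A minimal sketch of "RBAC at the surface, ABAC inside", assuming an AWS-style policy: one role-level statement whose condition only allows the action when the resource's tag matches the caller's tag. The function name and the `team` tag key are assumptions for illustration; `aws:ResourceTag` and `aws:PrincipalTag` are the global condition keys the paragraph refers to.

```python
import json


def team_scoped_policy(tag_key="team"):
    """Build a policy document that scopes EC2 start/stop to same-team resources."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "StartStopOwnTeamInstances",
                "Effect": "Allow",
                "Action": ["ec2:StartInstances", "ec2:StopInstances"],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        # Resource tag must equal the calling principal's tag.
                        f"aws:ResourceTag/{tag_key}": f"${{aws:PrincipalTag/{tag_key}}}"
                    }
                },
            }
        ],
    }


print(json.dumps(team_scoped_policy(), indent=2))
```

One role serves every team, and the tag does the scoping; that is also why a single wrong tag flips the decision.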
ReBAC belongs inside the application, not in the cloud control plane. If you're building product authorization (who can see which document, which org's data, which row in this multi-tenant table), reach for OpenFGA, SpiceDB, Cerbos, or Permify. AWS Verified Permissions (Cedar) is the AWS-native answer in the same category. Don't try to encode product authorization in IAM policies; that's a path to permission sprawl and customer-visible bugs.
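To show what "access is a graph relationship" means in code, here is a small sketch of a relationship check over tuples. It is not the OpenFGA, SpiceDB, Cerbos, or Permify API; the tuple layout, relation names, and objects are hypothetical, and the recursion assumes the parent graph has no cycles.

```python
# Relationship tuples: (object, relation, subject). All names are hypothetical.
TUPLES = {
    ("doc:roadmap", "parent", "folder:planning"),
    ("folder:planning", "owner", "team:product"),
    ("team:product", "member", "user:alice"),
    ("doc:roadmap", "editor", "user:bob"),
}


def subjects(obj, relation):
    """All subjects that hold `relation` on `obj`."""
    return {s for (o, r, s) in TUPLES if o == obj and r == relation}


def can_edit(user, obj):
    # Direct editor relationship on the object itself.
    if user in subjects(obj, "editor"):
        return True
    # Owners may edit: either the user directly or a team the user belongs to.
    for owner in subjects(obj, "owner"):
        if owner == user or user in subjects(owner, "member"):
            return True
    # Otherwise inherit from the parent container, if any.
    return any(can_edit(user, parent) for parent in subjects(obj, "parent"))
```

Alice can edit the roadmap only because of a chain (team member, team owns folder, folder contains doc), which is exactly the kind of rule that is painful to express in cloud IAM policies.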
Least privilege & permissions analyzers
"Grant the minimum permissions required" is the most-repeated and least-implemented principle in cloud security. The reasons it's hard at scale: the cloud has thousands of distinct actions, your application uses an unknown subset, that subset changes as code evolves, and "what was used" requires log analysis nobody has time to do manually. The 2026 toolchain finally makes this tractable.
Native permissions analyzers
- AWS IAM Access Analyzer - multiple analyzers in one product: external-access (resources reachable from outside the account), unused-access (roles, users, keys, permissions not exercised in N days), policy validation (warns about over-broad policies during authoring), and policy generation (generates a right-sized policy from CloudTrail history). The unused-access analyzer is the single highest-leverage tool for shrinking permissions on roles that already exist.
- Azure Entra PIM - Privileged Identity Management. Makes admin roles eligible-but-not-active by default; activation requires justification, approval, MFA, and bounds the access window. The audit trail on activations is auditor-gold.
- GCP Policy Analyzer + IAM Recommender - answers "what permissions does principal X actually have on resource Y" (Policy Analyzer) and "which permissions hasn't this service account used recently, downsize to this smaller role" (Recommender).
Third-party permissions / CIEM tools
- Wiz - CNAPP with strong CIEM (Cloud Infrastructure Entitlement Management) - visualizes attack paths through identity, finds toxic combinations (over-permissioned identity + public exposure + sensitive data + vulnerable workload).
- Sonrai - identity-graph focused; built around the "effective permissions" calculation across complex AWS / Azure / GCP IAM models.
- Permiso - identity threat detection & response - what identities did, not just what they could do.
- ConductorOne, Veza, Opal - access governance / lifecycle / JIT request platforms that sit above the cloud-native IAM primitives.
- Prisma Cloud, CrowdStrike, Orca - CNAPPs with embedded CIEM modules.
- Open source - Repokid (Netflix), CloudMapper, PMapper, iamlive. Useful starting points; less hands-off than the commercial tools.
The right-sizing loop
The operational practice that makes least-privilege real: (1) start a new workload with a deny-by-default role; (2) run it in staging with broad audit logging; (3) use a policy generator or iamlive to derive the minimum permissions actually used; (4) ship that minimum; (5) re-run the analyzer monthly and trim what falls into disuse. The pattern shrinks blast radius continuously without slowing engineers down - the analyzer does the work, the engineer reviews and approves the trim.
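Steps (3) and (5) of the loop reduce to a set difference between what a role is granted and what the logs show it used. The sketch below assumes CloudTrail-style records and a role whose policy lists explicit actions with no wildcards. Mapping `eventSource` to an IAM action prefix by taking the first DNS label is a simplification that does not hold for every service, which is one reason to prefer the native analyzers for real trimming.

```python
def used_actions(events):
    """Map CloudTrail-style records to IAM-style action names (simplified)."""
    actions = set()
    for event in events:
        service = event["eventSource"].split(".")[0]  # "s3.amazonaws.com" -> "s3"
        actions.add(f"{service}:{event['eventName']}")
    return actions


def trim_candidates(granted, events):
    """Granted actions that never appear in the observed events."""
    return sorted(set(granted) - used_actions(events))


# Hypothetical observation window and role policy.
events = [
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
    {"eventSource": "sqs.amazonaws.com", "eventName": "SendMessage"},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject"},
]
granted = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "sqs:SendMessage"]

print(trim_candidates(granted, events))
```

The output is a review list, not an automatic change: the engineer still confirms that an unused permission is not needed for a rare path such as disaster recovery.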
JIT access & PAM in cloud
Standing admin access is the highest-impact risk in any cloud environment. The fewer humans who hold standing admin rights, the smaller the blast radius when one gets phished. JIT - just-in-time - access is the pattern that collapses standing privilege to near-zero by making elevated access requestable, approvable, and time-bounded.
What JIT looks like by cloud
- AWS - IAM Identity Center permission sets configured with shorter session durations (1-4 hours) and approval gates. Third-party platforms (ConductorOne, Opal, Sym, Teleport) sit in front of permission set activation, capturing the request / approve / audit trail. Some shops build it from primitives - Slack request, Lambda activator, time-bound role assignment.
- Azure - Entra PIM is the native answer and is mature. Eligible vs active assignments, activation requires justification + MFA + optional approval, time bounds enforced. Pair with Conditional Access (require compliant device, location, etc.) on activation.
- GCP - temporary IAM grants with IAM conditions (expiry timestamp). Privileged Access Manager (PAM) is the newer native flow with request / approve / time-bound semantics.
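As one concrete shape of a time-bounded grant, the sketch below builds a GCP-style IAM binding whose condition expires the access at a timestamp. The `request.time < timestamp(...)` expression is the documented form for expiring access in IAM Conditions; the function name, the `now` parameter, the title text, and the member and role values are assumptions for illustration, and the dict is only constructed here, not applied to any policy.

```python
from datetime import datetime, timedelta, timezone


def temporary_binding(member, role, hours, now=None):
    """Build an IAM binding dict that stops matching after `hours` hours."""
    now = now or datetime.now(timezone.utc)  # `now` is expected to be UTC-aware
    expiry = (now + timedelta(hours=hours)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "role": role,
        "members": [member],
        "condition": {
            "title": "jit-expiring-grant",
            "description": "Time-bound grant from a hypothetical JIT workflow",
            "expression": f'request.time < timestamp("{expiry}")',
        },
    }


binding = temporary_binding("user:alice@example.com", "roles/compute.admin", hours=2)
```

Because expiry lives in the policy itself, the grant lapses even if the revocation job that was supposed to clean it up never runs.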
PAM (Privileged Access Management) in cloud
Traditional PAM (CyberArk, BeyondTrust, Delinea) is built for credential vaulting in an on-prem world - store the admin password, broker the session, record everything. The cloud-native version is different: there are fewer secrets to vault (because federated SSO replaces passwords), but the brokering / recording / approval workflow still matters for sensitive operations. The 2026 PAM shape:
- Vault the few remaining long-lived secrets - root credentials, break-glass accounts, third-party API keys that can't be replaced with OIDC. Use AWS Secrets Manager / Azure Key Vault / GCP Secret Manager, or a cross-cloud vault (HashiCorp Vault, 1Password Secrets Automation, Doppler).
- Broker access to sensitive resources - production databases, cluster admin access, customer data systems. Teleport, StrongDM, CyberArk Conjur, Tailscale SSH, plus identity-aware proxies (BeyondCorp Enterprise, AWS Verified Access, Cloudflare Zero Trust). The broker enforces JIT, records the session, and emits the audit log.
- Break-glass procedures - every cloud needs a documented, tested path for "the IdP is down and we need to access AWS / Azure / GCP root." Sealed root credentials, multi-person approval to unseal, audit of every use. Test this every six months; the time you need it is not the time to discover the runbook is stale.
Workload identity (no more long-lived keys)
Long-lived access keys are the most common source of cloud breaches. Every leaked GitHub commit with an AKIA-prefixed key, every Stack Overflow paste with a service-account JSON, every disgruntled-employee export - long-lived credentials get exfiltrated and reused. The fix is to eliminate them, not to rotate them harder.
Inside the cloud - platform workload identity
- AWS EC2 / ECS / Lambda - IAM roles attached to the compute. The SDK retrieves short-lived credentials from the instance metadata service on EC2 (IMDSv2 - never IMDSv1), the task credentials endpoint on ECS, or the execution environment on Lambda. No keys on disk, ever.
- AWS EKS - IRSA (IAM Roles for Service Accounts) - Kubernetes service accounts mapped to IAM roles via OIDC. Pods retrieve credentials via the SDK using a projected service-account token. The successor pattern is EKS Pod Identity, simpler to configure for net-new clusters.
- Azure - Managed Identities for VMs / App Service / Functions / Container Apps. For AKS, Azure Workload Identity (the successor to AAD Pod Identity) federates Kubernetes service accounts to Entra identities via OIDC.
- GCP - service accounts attached to Compute Engine / Cloud Run / Cloud Functions. GKE Workload Identity federates Kubernetes service accounts to GCP service accounts. Workload Identity Federation extends the same pattern to workloads running outside GCP.
Outside the cloud - OIDC trust for CI/CD and on-prem
The pattern that eliminated the last common need for long-lived cloud credentials: every major CI/CD platform now publishes OIDC tokens that AWS / Azure / GCP can trust as federated identities. No secret to leak; the trust is a verifiable JWT signed by the CI provider for that specific job.
- GitHub Actions - set `id-token: write` in the workflow permissions, configure an OIDC provider in AWS (or federated credential in Entra / Workload Identity Pool in GCP) that trusts `token.actions.githubusercontent.com`, scope the trust to specific repos / branches / environments. GitHub OIDC docs.
- GitLab CI - analogous with `id_tokens` in the job. GitLab OIDC docs.
- CircleCI - OIDC tokens via the OIDC contexts feature.
- Buildkite, Jenkins, Argo, Tekton - all support OIDC patterns; configuration varies. Vault and SPIFFE / SPIRE bridge the gap for workloads that don't have native OIDC.
- On-prem agents and third-party SaaS - Workload Identity Federation (GCP) and similar AWS / Azure mechanisms federate an external OIDC issuer (Okta, Auth0, Vault, SPIRE) so the on-prem workload presents an OIDC token and gets short-lived cloud creds.
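The "scope the trust" step for GitHub Actions is where most of the security lives, so here is a sketch of an AWS role trust policy for it. The issuer host, the `sts:AssumeRoleWithWebIdentity` action, and the `aud` / `sub` claim formats follow the published GitHub and AWS conventions; the function name, account ID, and repository are hypothetical placeholders.

```python
def github_oidc_trust_policy(account_id, repo, ref="refs/heads/main"):
    """Trust policy letting one repo and ref assume a role via GitHub's OIDC issuer."""
    issuer = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{issuer}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{issuer}:aud": "sts.amazonaws.com",
                        # Pin the subject claim; omitting it trusts every repo on GitHub.
                        f"{issuer}:sub": f"repo:{repo}:ref:{ref}",
                    }
                },
            }
        ],
    }


# Hypothetical account and repository.
policy = github_oidc_trust_policy("123456789012", "example-org/example-repo")
```

A trust policy that checks only the audience, or uses a broad wildcard on the subject, lets workflows in unrelated repositories assume the role, so the `sub` condition deserves the same review as the permissions policy.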
The metrics to chase
Two numbers that should both trend to zero: count of IAM users with access keys (every key is potential exposure) and median age of access keys (older keys are likelier to have been exposed). A mature program ends with a handful of break-glass IAM users, every other principal a role / service account / federated identity, and no key older than 30-90 days.
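Both metrics are easy to compute once active keys are exported into a list. The sketch below assumes a simple record shape (`user`, `created`) that you would populate from a credential report or equivalent export; the shape, the sample data, and the function name are hypothetical.

```python
from datetime import date
from statistics import median


def key_metrics(keys, today):
    """Return (distinct users holding keys, median key age in days)."""
    users_with_keys = len({k["user"] for k in keys})
    ages = [(today - k["created"]).days for k in keys]
    median_age_days = median(ages) if ages else 0
    return users_with_keys, median_age_days


# Hypothetical export of active access keys.
keys = [
    {"user": "ci-legacy", "created": date(2024, 1, 1)},
    {"user": "ci-legacy", "created": date(2025, 12, 2)},
    {"user": "break-glass", "created": date(2025, 12, 22)},
]

print(key_metrics(keys, today=date(2026, 1, 1)))
```

Tracking the pair over time matters more than any single reading: the count shows whether migration to roles is progressing, and the median age shows whether the remaining keys are actually being rotated.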
Non-human identities (NHI) - the sprawl problem
"Workload identity" (the previous section) is one slice of a much bigger category: non-human identities - every identity in your environment that isn't a person. The cloud-native workload identities you assign to pods, functions, and VMs are NHIs. So are the OAuth tokens your SaaS apps trade with each other, the API keys your finance team pasted into Zapier two years ago, the GitHub personal access token used by an Ansible runbook, the certificate that lets your CI runner deploy to Kubernetes, the service principal the third-party DLP scanner uses to read every mailbox in Microsoft 365, the agent identity the new internal AI assistant authenticates as, and the long tail of bots, scripts, and integrations no one remembers creating. NHI as a discrete security category exploded between 2023 and 2026 because the volume crossed the threshold where humans could no longer track it manually.
Why NHI is now the dominant identity problem
- Ratio. Industry surveys (CyberArk, Astrix, Entro, Oasis) consistently put the NHI-to-human ratio in modern cloud-native estates between 45:1 and 100:1, and trending up. A 1,000-person company often has 50,000+ NHIs and growing weekly.
- Lifecycle gap. Humans have HRIS-driven joiner/mover/leaver flows; NHIs don't. The CI/CD service account a contractor created in 2022 still works in 2026 because nobody knew to disable it when the project shipped. Most NHI breaches start with an identity nobody remembered existed.
- Ownership gap. Ask "who owns this service account" for a randomly-sampled NHI and the typical answer is silence or a former-employee's name. No owner means no rotation, no review, no incident response.
- Secret sprawl. NHI credentials live in vaults and code and CI environment variables and Slack DMs and Notion pages and personal `.env` files and Postman collections. Every additional copy is a leak path.
- Cross-system blast radius. A single OAuth token can grant a SaaS app access to your entire mailbox, drive, calendar, or codebase - and the consent screen typically buried that scope in eight lines of grey text.
- AI agents. The 2025-2026 explosion of autonomous agents (LangChain, AutoGen, OpenAI Assistants, Claude Agent SDK, internal copilots) added a new NHI class: identities that take dynamic, model-generated actions on behalf of humans. They need scoped credentials, auditable action trails, and JIT just like any service account - but most are launched with whatever token was handy.
What counts as an NHI
Cloud-native NHIs
AWS IAM users / IAM roles / instance profiles, Azure service principals / managed identities / federated credentials, GCP service accounts / workload-identity pools, Kubernetes service accounts. The ones the previous section covers - native to the cloud control plane and (mostly) replaceable with short-lived federated tokens.
SaaS-to-SaaS NHIs
OAuth apps consented to in Google Workspace / Microsoft 365 / Slack / GitHub. API keys in Stripe, Datadog, PagerDuty, Snowflake, Salesforce, HubSpot. Webhook signing secrets. The Okta/Cloudflare incident, the Microsoft Midnight Blizzard test-tenant compromise, the Dropbox Sign breach, and most "third-party SaaS got popped, blast radius hit us" incidents live in this category.
Workload & agent NHIs
CI/CD tokens (GitHub PATs, GitLab deploy tokens, CircleCI keys), bot accounts in chat / ticketing / on-call, RPA bots, internal scripts, scheduled jobs, AI agent identities. High-volume, high-churn, low-visibility - exactly the population legacy IGA tools were not built for.
The recent breach pattern
The 2023-2026 incident catalogue is heavy on NHI-origin breaches: Cloudflare's October 2023 incident traced to leaked Okta service tokens; Microsoft's Midnight Blizzard compromise via a legacy OAuth app with elevated mailbox scopes; the Internet Archive's 2024 breach via an exposed GitLab token; Dropbox Sign via a compromised production service account; New York Times source-code exposure via a GitHub PAT; Cloudflare again via Atlassian tokens stolen in the Okta support-case incident; and a steady drumbeat of TruffleHog / detect-secrets scans finding credentials in public repos. The common thread: an NHI was created for a real reason years ago, never reviewed, the secret leaked through one of N copies, and the blast radius was wide because nobody had scoped the original grant.
The NHI program - what "owning this" looks like
- Inventory. Across every cloud account, every SaaS, every code repo, every secret vault, every CI system. The first run of an NHI inventory typically surprises the security team with an order-of-magnitude higher count than expected.
- Ownership. Every NHI gets a named human owner and a documented purpose. No owner = candidate for deletion after a deprecation window.
- Classify by risk. What can this NHI do, and against what data? An NHI with `mail.read` on every mailbox in your tenant is critical; one that posts to a single Slack channel is not.
- Replace static secrets with short-lived tokens wherever the platform supports it. Cloud workloads → workload identity. CI/CD → OIDC trust (see the previous section). SaaS-to-SaaS where the provider supports OIDC or mTLS instead of API keys → use it.
- Vault the rest. The static keys that can't be eliminated belong in a secrets manager with rotation policies, access audit, and break-glass procedures. No more `.env` in Notion.
- Rotate on a schedule and on signal. Time-based rotation for everything; immediate rotation on owner-change, project-end, or any leak signal.
- Detect anomalous NHI behavior. Volume spikes, new source ASNs, unusual API calls, dormant-then-active patterns, new OAuth consents on high-scope SaaS - same detection playbook as human identities but with different baselines (NHIs are usually deterministic; humans aren't).
- Off-board on every project / employee exit. Treat NHI deprovisioning as part of every leaver workflow and every project sunset. The orphan service account is the most-reused initial-access vector in this whole category.
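The inventory, ownership, and classification steps above can start as a very small triage pass over whatever inventory export you have. This sketch assumes a hypothetical record shape and a 90-day staleness threshold; neither comes from a specific tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class NHI:
    name: str
    owner: Optional[str]          # named human owner, if any
    last_used: Optional[date]     # None means never observed in use
    high_risk_scopes: bool = False


def triage(nhi, today, stale_after_days=90):
    """Return a list of findings for one non-human identity."""
    findings = []
    if not nhi.owner:
        findings.append("no-owner")
    if nhi.last_used is None or (today - nhi.last_used).days > stale_after_days:
        findings.append("stale")
    if nhi.high_risk_scopes:
        findings.append("high-risk-scope")
    return findings


# Hypothetical inventory rows.
inventory = [
    NHI("svc-dlp-scanner", "dana", date(2025, 12, 30), high_risk_scopes=True),
    NHI("ci-2022-migration", None, date(2023, 3, 1)),
]
for item in inventory:
    print(item.name, triage(item, today=date(2026, 1, 1)))
```

An identity that is both ownerless and stale is the natural first candidate for the deprecation window described above.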
NHI security vendors
The category has matured fast; expect consolidation through 2026-2027.
- Astrix Security - one of the first dedicated NHI platforms; strong on third-party SaaS app discovery, OAuth grant inventory, and posture across cloud + SaaS.
- Entro Security - secrets-and-NHI lifecycle management; inventory across vaults, code, CI, and cloud with anomaly detection.
- Oasis Security - NHI inventory, posture, and JIT issuance across cloud and SaaS.
- Aembit - workload-IAM platform for issuing short-lived credentials for workload-to-workload auth (the "OIDC-trust for everything that isn't CI" pattern).
- Token Security - machine-identity discovery, ownership attribution, and posture.
- Clutch Security - NHI-focused platform spanning discovery, lifecycle, and JIT.
- Britive - cross-cloud JIT for both humans and NHIs; one of the older entrants.
- Natoma - NHI governance and lifecycle.
- Andromeda Security - identity-security posture covering NHI alongside human identity.
- Adjacent CIEM + secrets - Wiz, Sonrai, Veza, and Permiso all have NHI inventory / detection capabilities embedded in broader products. Secrets-management leaders (HashiCorp Vault, CyberArk Conjur, Akeyless, Doppler, Infisical) are pushing into NHI lifecycle from the secret-store direction.
- Open source & OSS-adjacent - TruffleHog, detect-secrets, Gitleaks for secret discovery in code; SPIFFE / SPIRE for workload identity standards across heterogeneous environments.
The AI-agent wrinkle
AI agents are NHIs that take non-deterministic actions. A traditional service account does the same five API calls every hour; an agent might call any tool in its toolset depending on what the model decides. That breaks the usual anomaly-detection baselines - "this NHI just called an API it has never called before" is a normal Tuesday for an agent. The 2026 patterns emerging: scope each agent's credentials to the smallest possible toolset (per-agent service accounts, not one shared "AI" role), log every tool invocation with the user context that initiated it, route high-impact actions through a human approval step, and treat agent prompts as untrusted input that can exfiltrate or escalate via tool calls. Frameworks like OWASP LLM Top 10 and the emerging NIST AI risk-management guidance cover the threat model. See also AI/ML Security.
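A minimal sketch of three of those patterns together: a per-agent tool allowlist, a human-approval gate for high-impact tools, and an audit record that carries the initiating user. The tool names, the agent record shape, and the function are all hypothetical; this is not any agent framework's API.

```python
HIGH_IMPACT = {"delete_records", "send_payment"}  # hypothetical tool names


def authorize_tool_call(agent, tool, on_behalf_of, approved=False, audit_log=None):
    """Return True if the agent may invoke the tool; record every attempt."""
    allowed = tool in agent["allowed_tools"]
    if allowed and tool in HIGH_IMPACT and not approved:
        allowed = False  # high-impact actions wait for a human approval step
    if audit_log is not None:
        audit_log.append(
            {"agent": agent["name"], "tool": tool, "user": on_behalf_of, "allowed": allowed}
        )
    return allowed


# Hypothetical per-agent identity with its own narrow toolset.
billing_agent = {"name": "billing-assistant", "allowed_tools": {"lookup_invoice", "send_payment"}}
log = []
authorize_tool_call(billing_agent, "lookup_invoice", "user:alice", audit_log=log)
authorize_tool_call(billing_agent, "send_payment", "user:alice", audit_log=log)
```

Since the model's choice of tool cannot be predicted, the check sits outside the model: a prompt-injected request for a tool that is off the allowlist, or unapproved, is denied and logged regardless of what the agent was convinced to attempt.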
Secrets in developer workflows - IDEs, CLIs, Terraform, DevOps
Secrets managers and vaults are only as useful as the workflows that consume them. Identity is the auth story: every retrieval from a vault should authenticate with the developer's federated identity or the workload's platform identity - never with a static credential to fetch the other secrets. This section covers the integration patterns for the surfaces developers actually touch every day. For the underlying secrets-manager capabilities (rotation, dynamic secrets, versioning, audit), see Data Security & KMS - Secrets management; for pipeline-specific patterns see CI/CD - Pipeline secrets; for cluster patterns see Kubernetes - Cluster secrets.
Local development & IDEs - kill the .env file
The most common leak path on this whole page: a developer pastes a production API token into .env, the file gets committed, the secret-scanner catches it three commits later, the rotation cycle begins. The fix is to never let a static secret land on disk in the first place. The pattern: a small wrapper authenticates the developer to the vault using their SSO identity, fetches the secret at process-start, exports it to the child process's environment, and never persists it.
- AWS - `aws sso login` for federated CLI access (no static keys); aws-vault wraps SDK calls in temporary STS credentials backed by your OS keychain; `aws secretsmanager get-secret-value` or the Secrets Manager Agent for local-dev parity.
- Azure - `az login` for federated CLI; `DefaultAzureCredential` in the SDK transparently tries environment variables, then workload / managed identity, then developer-tool logins such as `az login`; `az keyvault secret show` for ad-hoc retrieval.
- GCP - `gcloud auth login` + `gcloud auth application-default login` (the latter for SDK apps); `gcloud secrets versions access`; service-account impersonation (`--impersonate-service-account`) instead of downloading SA keys.
- 1Password CLI (`op run`) - reference secrets in `.env` as `op://Vault/Item/field`; `op run --env-file=.env -- npm start` resolves them at process-start. Works with any vault you keep in 1Password; popular for SaaS API keys that don't live in a cloud secrets manager.
- Doppler CLI (`doppler run`), Infisical CLI, Akeyless CLI - same pattern; the vendor differences are mostly the backend, integration breadth, and pricing.
- Vault Agent on the workstation - templated config files materialized from Vault, auto-refreshed before TTL expiry. The right answer when the team already runs Vault.
- IDE integrations - 1Password VS Code extension (inline secret references, secret-scanning on save), AWS Toolkit, Azure Tools, Cloud Code for VS Code/JetBrains all retrieve secrets via the user's federated identity. Gitleaks / TruffleHog as pre-commit hooks catch the rare slip.
- Direnv + cloud SSO - `.envrc` with `aws-vault exec` or `op run --env-file` automatically loads scoped credentials when you `cd` into a repo, unloads when you leave. Eliminates "wrong profile" mistakes.
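All of the tools in this list implement the same wrapper idea: fetch at process start, hand the value to the child process's environment, persist nothing. A stripped-down sketch of that pattern follows. `fetch_secret` is a hypothetical stand-in for a vault lookup made with the developer's federated session; it returns a placeholder here so the example has no external dependencies.

```python
import os
import subprocess


def fetch_secret(name):
    """Hypothetical stand-in for a vault lookup using the developer's SSO session."""
    return f"placeholder-for-{name}"


def child_env(secret_names):
    """Copy the current environment and add secrets only to the copy."""
    env = dict(os.environ)
    for name in secret_names:
        env[name] = fetch_secret(name)
    return env


def run_with_secrets(command, secret_names):
    """Run `command` with secrets injected; nothing is written to disk."""
    return subprocess.run(command, env=child_env(secret_names), check=True)
```

A call such as `run_with_secrets(["npm", "start"], ["DB_PASSWORD"])` would give the child process the value while the parent shell and the repository never see it, which is the property a `.env` file cannot offer.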
Terraform & OpenTofu - dynamic credentials and vault-sourced inputs
Two distinct concerns: (1) how Terraform itself authenticates to the cloud provider, and (2) how Terraform retrieves secret values it needs to pass to resources. Both have mature answers that avoid static secrets.
Authenticating Terraform to the cloud
- Terraform Cloud / Terraform Enterprise dynamic provider credentials - HCP Terraform mints an OIDC token, the AWS / Azure / GCP provider trusts it, the run gets short-lived credentials scoped to a specific workspace. No long-lived keys in the workspace settings. Same pattern GitLab and GitHub Actions use to authenticate Terraform runs.
- Spacelift, env0, Scalr, Digger - third-party Terraform/OpenTofu collaboration platforms with the same OIDC-to-cloud pattern.
- Local `terraform plan` - use the same federated CLI auth above (`aws sso login`, `az login`, `gcloud auth application-default login`); never paste credentials into `~/.aws/credentials` manually.
Pulling secret values into Terraform configurations
- Vault provider -
data "vault_kv_secret_v2"reads at plan-time;data "vault_generic_secret"for older paths; supports dynamic secrets engines (DB creds minted per-run). aws_secretsmanager_secret_version,azurerm_key_vault_secret,google_secret_manager_secret_version- cloud-native data sources for each provider.- Mark outputs
sensitive = true- Terraform redacts them in plan/apply output but they still land in state. State files must be treated as secrets: encrypted backend (S3 + KMS, Azure Blob + CMK, GCS + CMEK), strict IAM, no local state files for any environment with real secrets. - OpenTofu encryption - OpenTofu 1.7+ ships native state encryption, mitigating the "state is plaintext" risk when the backend's at-rest encryption isn't enough.
- Anti-patterns - variable "db_password" { type = string } sourced from a TF_VAR_ env var pulled from the CI secret store. Works but routes the secret through CI, into Terraform state, and into provider logs. Prefer a data source that fetches from the vault at apply-time using the run's identity.
DevOps tooling - Ansible, Pulumi, Helm, scripts
- Ansible - Ansible Vault for in-repo encrypted vars; lookup plugins for HashiCorp Vault, AWS Secrets Manager, Azure Key Vault, GCP Secret Manager for runtime retrieval.
- Pulumi - native secrets support with cloud-KMS-backed providers and Pulumi ESC for centralized environment/secret management.
- Helm - never put plaintext in values.yaml. Use the helm-secrets plugin with SOPS, or render values from External Secrets Operator at deploy-time. See Kubernetes - Cluster secrets.
- Shell scripts & cron - wrap with aws-vault exec / op run / vault read; if it has to run unattended on a server, give the host a workload identity (IAM role, managed identity, GCP SA, SPIRE) and let the SDK pick it up. Static keys on hosts are the easiest thing in this whole list to eliminate.
The decision tree
- Can the platform mint a short-lived credential via workload identity or OIDC? Use that. (Cloud SDK on EC2 / AKS / GKE, OIDC trust for CI/CD, TFC dynamic provider credentials.) No secret to manage is the goal state.
- If not, can the secret be retrieved on-demand from a vault using the caller's identity? Use that. (External Secrets Operator, Vault Agent, AWS/Azure/GCP SDK against their native managers.)
- If a static value must live somewhere, is it in a secrets manager with rotation, audit, and access control? Use that. Never an env var checked into config, never a plaintext file on disk, never in chat.
- Pre-commit, pre-push, and CI scanners are the safety net, not the strategy. TruffleHog, Gitleaks, GitHub push protection, GitLab Secret Detection - all should be enabled, but if they're catching anything regularly, the workflow upstream needs fixing.
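The decision tree above can be sketched as a small function that walks the options in order and returns the first one that applies. This is only an illustration: the SecretNeed fields and the returned labels are hypothetical names invented for this example, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class SecretNeed:
    """Hypothetical description of one credential a workflow needs."""
    platform_can_mint_short_lived: bool      # workload identity or OIDC trust is available
    vault_fetch_with_caller_identity: bool   # readable on demand using the caller's identity
    must_be_static: bool                     # a fixed value has to be stored somewhere


def choose_secret_strategy(need: SecretNeed) -> str:
    """Walk the decision tree top to bottom; the first match wins."""
    if need.platform_can_mint_short_lived:
        return "workload-identity-or-oidc"
    if need.vault_fetch_with_caller_identity:
        return "on-demand-vault-fetch"
    if need.must_be_static:
        return "secrets-manager-with-rotation"
    return "no-secret-needed"
```

The ordering is the point: a workflow that could use workload identity never reaches the static-secret branch, even if a static secret would also work.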
Common cloud privilege escalation paths
The defender's view of how attackers move from a low-privilege foothold to admin in cloud. The pattern names change by provider; the shape repeats.
AWS
- iam:PassRole abuse. The single most-abused permission. If the compromised principal can iam:PassRole a powerful role to a service it controls (Lambda, EC2, Glue, CloudFormation, SageMaker, dozens more), it effectively assumes that role. Defense: scope iam:PassRole with iam:PassedToService and explicit role-ARN allowlists. Audit with Access Analyzer's unused-permission view.
- AssumeRole chains. Role A trusts role B which trusts role C. The compromised B can chain to A or C if the trust policies are permissive. Defense: scope sts:AssumeRole trust policies tightly - specific principals, not Account: 123456789012; specific external IDs for cross-account; aws:SourceAccount / aws:SourceArn conditions to prevent confused-deputy attacks.
- Confused deputy via cross-account trust. An AWS service (a backup service, a monitoring SaaS) assumes a role in your account. If the trust policy doesn't pin the source account / external ID, a malicious tenant of that service can trick the service into acting against your account. Defense: aws:SourceAccount + aws:SourceArn + sts:ExternalId on every cross-account trust policy.
- iam:CreateAccessKey, iam:CreateLoginProfile, iam:AttachUserPolicy, iam:PutUserPolicy. The "let me make myself more powerful" permissions. Any of them on a principal that can target a more-privileged user is escalation. Defense: deny these in a service control policy, with an exception only for a narrowly scoped admin role.
- IMDSv1 + SSRF. An app vulnerable to SSRF reads the EC2 instance metadata endpoint and exfiltrates the instance role credentials. Defense: enforce IMDSv2 (the token-based session protocol) account-wide; SCP-block IMDSv1 instances. Run aws ec2 modify-instance-metadata-options on every existing instance.
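As a rough illustration of the iam:PassRole defense, the sketch below scans an identity policy document (as a parsed JSON dict) for Allow statements that grant iam:PassRole without a role-ARN allowlist or without an iam:PassedToService condition. The function names are hypothetical, and the check is deliberately simple: it ignores NotAction, NotResource, and explicit Deny statements, so treat it as a starting point rather than a replacement for Access Analyzer.

```python
import fnmatch
from typing import Any


def _as_list(value: Any) -> list:
    """IAM policy fields may be a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_broad_passrole(policy: dict) -> list[str]:
    """Return one finding per way an Allow statement grants iam:PassRole too broadly."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue
        # Action patterns such as "iam:Pass*", "iam:*" or "*" also cover iam:PassRole.
        actions = [a.lower() for a in _as_list(stmt.get("Action"))]
        if not any(fnmatch.fnmatchcase("iam:passrole", a) for a in actions):
            continue
        label = stmt.get("Sid", f"statement[{index}]")
        if "*" in _as_list(stmt.get("Resource")):
            findings.append(f"{label}: Resource is '*', not a role-ARN allowlist")
        conditions = stmt.get("Condition", {})
        has_service_pin = any(
            key.lower() == "iam:passedtoservice"
            for operator_block in conditions.values()
            for key in operator_block
        )
        if not has_service_pin:
            findings.append(f"{label}: no iam:PassedToService condition")
    return findings


# Example policies; the account ID and role name are placeholders.
BROAD = {"Version": "2012-10-17", "Statement": [
    {"Sid": "PassAnything", "Effect": "Allow", "Action": "iam:PassRole", "Resource": "*"}]}
SCOPED = {"Version": "2012-10-17", "Statement": [
    {"Sid": "PassLambdaRole", "Effect": "Allow", "Action": "iam:PassRole",
     "Resource": "arn:aws:iam::123456789012:role/app-lambda-exec",
     "Condition": {"StringEquals": {"iam:PassedToService": "lambda.amazonaws.com"}}}]}
```

Calling find_broad_passrole(BROAD) reports both problems for the PassAnything statement, while the scoped policy produces no findings.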
GCP
- Service account impersonation (iam.serviceAccounts.actAs, getAccessToken, generateAccessToken). If the compromised principal can impersonate a more privileged service account, it inherits that service account's permissions for the duration of the impersonation. Defense: narrow which principals have actAs on which service accounts; alert on impersonation events of high-value SAs.
- Service account key creation. iam.serviceAccountKeys.create on a privileged SA mints a long-lived key the attacker can exfiltrate. Defense: organization policy iam.disableServiceAccountKeyCreation at the org level; require a documented exception path.
- Workload Identity binding abuse. A compromised Kubernetes service account bound to a powerful GCP service account is, effectively, that GCP service account. Defense: principle of least privilege on Workload Identity bindings; one GSA per workload, narrowly scoped.
- Hierarchical inheritance. A role granted at the organization level inherits down. A misplaced "Owner" at org level is everywhere. Defense: grant at the lowest level that works (project or folder), audit org-level bindings monthly.
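For the monthly audit of org-level bindings, a minimal sketch: given an IAM policy in the JSON shape that the get-iam-policy commands return (a bindings list of role plus members), list every member holding a broad role. Which roles count as "broad" is an assumption made here; adjust the set to your own standards.

```python
# Assumption: these basic roles are the ones worth flagging at organization level.
BROAD_ROLES = frozenset({"roles/owner", "roles/editor"})


def broad_org_bindings(policy: dict, broad_roles=BROAD_ROLES) -> list[tuple[str, str]]:
    """Return sorted (role, member) pairs for broad roles granted in this policy."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding.get("role")
        if role in broad_roles:
            for member in binding.get("members", []):
                findings.append((role, member))
    return sorted(findings)


# Hypothetical organization-level policy used only for illustration.
ORG_POLICY = {
    "bindings": [
        {"role": "roles/owner", "members": ["user:founder@example.com"]},
        {"role": "roles/viewer", "members": ["group:auditors@example.com"]},
    ]
}
```

Because roles inherit downward, every pair this returns for an organization policy applies to every folder and project beneath it.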
Azure
- Managed identity abuse. A VM with a User-Assigned Managed Identity that holds Contributor on the subscription - compromise the VM, get the subscription. The VM's IMDS endpoint mints tokens for the managed identity. Defense: scope managed identities to the minimum role on the minimum scope; never Contributor on a subscription for a workload.
- Service principal credential addition. Adding a new password or certificate to an existing service principal - and the resulting identity inherits the SP's app roles. Defense: alert on credential additions to privileged service principals; review SP credential lifetimes.
- Federated credential abuse. An Entra app's federated credentials allow OIDC tokens to authenticate as the app. A misconfigured federated credential - e.g., trusting repo:* across all branches without environment scoping - lets a PR from a fork mint cloud creds. Defense: scope federated credentials to specific repos, environments, branches.
- Owner / Global Admin separation. Azure RBAC (subscription Owner) and Entra ID admin roles (Global Admin) are independent in some configurations, conflated in others. A compromised Global Admin can grant itself Owner via the "Access management for Azure resources" toggle. Defense: PIM-protect both, separate the populations, monitor escalation paths.
Cross-cloud and federation
- OIDC trust misconfigurations. The most common 2026 pattern: a GitHub Actions OIDC trust configured with repo:* instead of repo:org/repo:ref:refs/heads/main. Any PR - including from a fork - can mint cloud creds for the trusted role. Defense: always scope the OIDC subject claim to specific repo + branch / environment; prefer environment-scoped trust over branch.
- SAML response tampering. Historic (CVE-2017-11427 and friends) - bugs in signature validation let attackers modify the assertion's user claim. Mostly patched, occasionally rediscovered in roll-your-own SAML libraries. Defense: use the IdP-provided SDK, not a homegrown parser.
- SCIM token leakage. SCIM provisioning tokens can typically create / modify users in the downstream app. A leaked SCIM token is account takeover for the entire tenant. Defense: rotate SCIM tokens; alert on user-creation outside expected HRIS sync windows.
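The OIDC subject-claim rule lends itself to a lint check in CI for trust policies. The sketch below classifies a GitHub Actions subject pattern as pinned, wildcarded, or needing a human look. The function name and the three result strings are made up for this example, and the regular expression only recognizes the branch and environment forms named above.

```python
import re

# Accept only subjects pinned to one repo plus a specific branch or environment.
_PINNED = re.compile(
    r"^repo:[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+:"
    r"(environment:[^*?]+|ref:refs/heads/[^*?]+)$"
)


def review_subject(pattern: str) -> str:
    """Classify one subject pattern taken from an OIDC trust condition."""
    if _PINNED.match(pattern):
        return "ok"
    if not pattern.startswith("repo:"):
        return "reject: not a repo-scoped subject"
    if "*" in pattern or "?" in pattern:
        return "reject: wildcard in subject"
    return "review: pinned to something other than a branch or environment"
```

A pattern such as repo:org/repo:pull_request lands in the "review" bucket on purpose: it is specific, but it is exactly the trigger a fork can reach.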
For attack-path tooling beyond manual review, BloodHound (Cloud), PMapper, Pacu, and the CIEM tools listed earlier all visualize these paths automatically. Run them in offense mode against your own environment quarterly.
MFA & phishing-resistant authentication
"MFA enabled" is no longer a meaningful security control by itself. The factor type matters more than the presence of a second factor.
The factor ladder
- SMS / voice OTP - phishable, SIM-swappable, interceptable via SS7. Use only as a fallback for accounts with no other risk exposure; never for cloud admins.
- TOTP (Google Authenticator, Authy, 1Password) - better than SMS, still phishable via real-time relay (the attacker proxies your TOTP through Evilginx / Modlishka). Acceptable for low-risk users; insufficient for admins.
- Push notifications - Duo Push, Microsoft Authenticator, Okta Verify. Vulnerable to MFA fatigue (spam the user until they tap). Number-matching mitigates most of that. Acceptable middle tier.
- FIDO2 / WebAuthn / passkeys - phishing-resistant. The credential is bound to the origin (the IdP domain), so a phishing site can't relay it. YubiKeys, Titan Keys, Touch ID, Windows Hello, platform passkeys all qualify. This is the 2026 minimum for any account with prod cloud access.
- Hardware-backed device certificates - for high-value access, the credential is sealed to a TPM / Secure Enclave, gated by user presence. The session is bound to the device; stolen tokens don't replay.
Conditional access
The factor is one input to the access decision; context is the other. Conditional access policies (Entra Conditional Access, Okta Adaptive MFA, Google BeyondCorp, Cloudflare Zero Trust) evaluate device posture, location, network, time, sign-in risk score, and recent behavior - and either allow, step-up (require an additional factor), or block. The shape of a mature policy: only managed compliant devices, only from expected geographies, step-up for admin scopes, block from anonymizing networks (Tor, known VPN exit nodes that match attacker telemetry).
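To make the allow / step-up / block shape concrete, here is a toy evaluator for the "mature policy" described above. Real products express this in their own policy engines; the SignIn fields and the evaluate function are hypothetical and exist only to show the order in which the checks combine.

```python
from dataclasses import dataclass


@dataclass
class SignIn:
    """Hypothetical context attached to one sign-in attempt."""
    device_managed: bool
    device_compliant: bool
    country: str
    anonymizing_network: bool
    admin_scope: bool
    phishing_resistant_factor: bool


def evaluate(sign_in: SignIn, allowed_countries: set[str]) -> str:
    """Return 'block', 'step-up', or 'allow' for a sign-in attempt."""
    if sign_in.anonymizing_network:
        return "block"
    if not (sign_in.device_managed and sign_in.device_compliant):
        return "block"
    if sign_in.country not in allowed_countries:
        return "block"
    # Context is acceptable; admin scopes still need the stronger factor.
    if sign_in.admin_scope and not sign_in.phishing_resistant_factor:
        return "step-up"
    return "allow"
```

Block conditions are evaluated before step-up so that a strong factor never overrides a bad device or network.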
Session binding
The 2026 attack pattern that bypasses MFA is session-cookie theft via infostealer (Lumma, RedLine, etc.). The user does FIDO2 successfully; the malware exfiltrates the resulting session cookie; the attacker reuses it from anywhere. Defenses: device-bound session credentials (DBSC), token-binding, continuous-access evaluation that revokes sessions on risk-score changes, and short session lifetimes for sensitive scopes. Expect session-theft to dominate the next round of cloud incidents until DBSC and equivalents are universally deployed.
Detection signals for identity abuse
The detections every cloud SOC should have running. None of these are exotic; the failure mode is rarely "we couldn't see it" - it's "we never wrote the rule." (See also: Cloud SOC.)
High-signal detections
- Impossible travel. Same user, two sessions in geographically incompatible locations within a window shorter than physical travel allows. Native in Entra ID Protection, Okta ThreatInsight, and most CASB / SIEM platforms.
- Console login from a new ASN / country / device. For privileged identities, "first-seen" is the alert. Suppress with a baseline of expected sources.
- GetCallerIdentity from an unusual context. The "who am I" reconnaissance call is one of the most reliable post-compromise indicators. A long-lived access key calling GetCallerIdentity from a new ASN is high-fidelity. (AWS-specific; analogs exist for Azure / GCP.)
- New AssumeRole / impersonation event for a high-value role. If only three principals should ever assume the org-admin role, the fourth is the alert.
- AssumeRole chain length anomaly. Role A assumed via B via C is normal in some workflows, unusual in most. Baseline and alert on chain length growth.
- MFA disabled / password reset / new access key / new federated credential. On any privileged principal, these are critical-severity. On standard principals, watch for clustering (the attacker hits ten in a row).
- OAuth app consent / new federated trust. New OAuth app consents in the IdP and new federated credentials on Entra apps / AWS OIDC providers / GCP workload-identity pools are the modern persistence mechanism. Alert and review.
- Privilege-escalation primitive use. iam:AttachUserPolicy, iam:PutUserPolicy, iam:CreateLoginProfile, iam:CreateAccessKey targeting another user; sts:GetFederationToken; resource-policy-write actions on critical resources. Catalog the actions; alert on any use outside well-known admin workflows.
- Service-account-key creation (GCP), federated credential addition (Azure), IAM user creation (AWS). Rare in mature orgs; high-value to alert on.
- Unused-then-used. A role that hasn't been used in 90 days, suddenly used. Dormant identity activation is a strong attacker signal.
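The impossible-travel check from the list above reduces to arithmetic: great-circle distance between two sign-in locations divided by the time between them. A minimal sketch follows, assuming sessions already carry a timestamp and GeoIP coordinates; the speed ceiling and the jitter floor are assumed values to tune, not vendor defaults.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 900.0   # assumption: roughly airliner cruise speed
MIN_DISTANCE_KM = 100.0     # assumption: ignore GeoIP jitter below this distance


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(first, second, max_kmh: float = MAX_PLAUSIBLE_KMH) -> bool:
    """first and second are (timestamp, latitude, longitude) tuples for one user."""
    t1, lat1, lon1 = first
    t2, lat2, lon2 = second
    distance = haversine_km(lat1, lon1, lat2, lon2)
    if distance < MIN_DISTANCE_KM:
        return False
    hours = abs((t2 - t1).total_seconds()) / 3600
    if hours == 0:
        return True
    return distance / hours > max_kmh
```

In practice the noisy part is the geolocation, not the math: corporate VPN egress and mobile carriers are the usual false-positive sources, which is why the baseline of expected sources matters.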
Where the detections live
Feed CloudTrail (AWS), Activity Log + Sign-in Logs (Azure), and Cloud Audit Logs (GCP) plus the IdP audit log (Okta System Log, Entra Sign-in / Audit logs, Google Workspace Reports) into one of: a SIEM (Splunk, Elastic, Sumo Logic, Sentinel, Chronicle), a detection-engineering platform (Panther, Anvilogic, Hunters), or the cloud-native detection stack (GuardDuty + Detective, Defender for Cloud, Security Command Center + Chronicle). The detections above translate to all of them; the rule libraries (Sigma, Splunk SPL, KQL, YARA-L) are widely shared.
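Before a rule is ported to SPL, KQL, or Sigma, it can help to prototype the logic over raw records. The sketch below implements one detection from the list, privilege-escalation primitives aimed at another user, over CloudTrail-style records. It relies on the eventSource, eventName, userIdentity, and requestParameters fields of CloudTrail records; the function name and the admin allowlist parameter are assumptions made for this example.

```python
ESCALATION_EVENTS = frozenset(
    {"AttachUserPolicy", "PutUserPolicy", "CreateLoginProfile", "CreateAccessKey"}
)


def escalation_findings(records, admin_arns=frozenset()):
    """Flag IAM escalation calls that target a user other than the caller."""
    findings = []
    for record in records:
        if record.get("eventSource") != "iam.amazonaws.com":
            continue
        if record.get("eventName") not in ESCALATION_EVENTS:
            continue
        identity = record.get("userIdentity") or {}
        caller_arn = identity.get("arn", "")
        if caller_arn in admin_arns:
            continue  # well-known admin workflow, handled by a separate review
        target = (record.get("requestParameters") or {}).get("userName")
        # No explicit target, or target equals the caller: acting on itself.
        if target is None or target == identity.get("userName"):
            continue
        findings.append(
            {"event": record["eventName"], "caller": caller_arn, "target": target}
        )
    return findings
```

Self-targeted calls are skipped here to keep the signal high; a stricter variant would alert on those too for privileged principals.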
AWS, Azure, and GCP side-by-side
The native IAM constructs each cloud ships, reduced to a one-screen reference:
| Capability | AWS | Azure | GCP |
|---|---|---|---|
| Workforce SSO | IAM Identity Center (federated to Okta / Entra / Google) | Entra ID (native), federate external IdPs | Cloud Identity / Workforce Identity Federation |
| Authorization model | Identity-based + resource-based policies, JSON | Azure RBAC + Azure Policy, role definitions in JSON | IAM bindings (role-based) + IAM Conditions |
| Workload identity (VMs / functions) | IAM roles + IMDSv2 | Managed Identities (system / user-assigned) | Service accounts attached to compute |
| Kubernetes workload identity | IRSA / EKS Pod Identity | Azure Workload Identity (AKS) | GKE Workload Identity |
| External workload federation | IAM OIDC identity providers + AssumeRoleWithWebIdentity | Entra federated credentials on app registrations | Workload Identity Federation pools & providers |
| JIT / privileged access | IAM Identity Center session limits + third-party (ConductorOne, Opal, Sym, Teleport) | Entra Privileged Identity Management (PIM) | Privileged Access Manager + conditional IAM |
| Permissions / unused-access analysis | IAM Access Analyzer (external + unused + policy gen) | Entra Permissions Management (formerly CloudKnox), Access Reviews | IAM Recommender, Policy Analyzer, Policy Troubleshooter |
| Org-level guardrails | Service Control Policies, Resource Control Policies, Declarative policies | Azure Policy + Initiatives at mgmt-group scope | Organization Policy constraints |
| Activity audit log | CloudTrail (org trail) | Activity Log + Entra Audit / Sign-in Logs | Cloud Audit Logs (Admin / Data Access) |
| Identity threat detection | GuardDuty (IAM & S3 protection), Detective | Entra ID Protection, Defender for Identity, Defender for Cloud | Security Command Center, Event Threat Detection, Chronicle |
| Phishing-resistant MFA | FIDO2 security keys + passkeys via IAM Identity Center / IdP | FIDO2 + passkeys + Windows Hello via Conditional Access | FIDO2 + Titan keys + passkeys via Google Workspace |
| Break-glass & root | Root user - store credentials sealed; CloudTrail-alert on use | Global Admin emergency accounts (2+), excluded from MFA Conditional Access carefully | Super Admin emergency accounts, gcloud-restricted access |
The native tools are necessary but rarely sufficient at scale. Most orgs running multi-cloud reach for a CIEM (Wiz, Sonrai, Permiso, ConductorOne, Veza) to get a cross-cloud effective-permissions view; the native tools are best-in-class for their own cloud and weakest at "show me everything one principal can do across our entire estate."
Maturity stages
A useful staging model for a cloud IAM program:
Crawl - Federated & inventoried
One IdP for all humans, federated into every cloud. SCIM-driven joiner / mover / leaver. Long-lived access keys inventoried and aged out. MFA enforced (any factor) on all admins. CloudTrail / Activity Log / Cloud Audit Logs centralized. A first inventory of roles and service accounts exists.
Walk - Least privilege at rest
Permissions analyzers running monthly; unused permissions trimmed. CIEM or equivalent visualizes effective permissions across accounts. Workload identity adopted for most net-new workloads; access keys trending toward zero. Phishing-resistant MFA mandatory for admins. Basic identity-abuse detections firing into the SOC.
Run - JIT & engineered
Standing admin reduced to near-zero via JIT. SCPs / Organization Policies / Azure Policy guardrails block known-bad patterns at the API. OIDC trust for all CI/CD; no long-lived secrets in pipelines. Identity-threat detections (impossible travel, unused-then-used, escalation primitives) all running. Quarterly attack-path review with BloodHound / PMapper / CIEM.
Fly - Identity is a product
Identity team treats internal IAM as a paved-road service: self-service role requests, automated right-sizing, drift detection blocking PRs in IaC. Customer-facing identity (CIAM / B2B SSO) is differentiated as a competitive feature. Continuous-access evaluation, device-bound sessions, and just-enough-access reduce blast radius to single API operations.
The skip-stage cost is real here - JIT without federated SSO produces a Frankenstein of half-controls; CIEM without an IdP inventory produces noise without action. Sequence matters.
Common pitfalls
- Treating IAM as one-time configuration. Roles, trusts, and policies drift weekly as teams ship features. Without continuous review, your "least privilege" diagram is fiction by month three. Schedule the review; tool it.
- Wildcarding the OIDC subject claim. Trusting repo:* or repo:org/* in a GitHub-Actions OIDC trust lets any PR - including a fork from outside your org - mint cloud creds. Always pin repo:org/repo:environment:prod or repo:org/repo:ref:refs/heads/main.
- Long-lived access keys on humans. No human in 2026 needs a static AWS access key. Federate through IAM Identity Center; if engineers need CLI access, use aws sso login. Same logic for service-account JSON keys in GCP; use gcloud auth application-default login or impersonation.
- Over-broad iam:PassRole. Granting iam:PassRole "*" is a blank check for privilege escalation. Always pair with iam:PassedToService and explicit role-ARN allowlists. Audit existing grants quarterly.
- Confusing roles and groups. AWS IAM groups don't show up in CloudTrail; group-attached policies are invisible at audit time without an extra lookup. Use roles + permission sets for human access; reserve groups for bulk policy attachment, not as the identity primitive.
- Skipping the off-boarding test. "We have SCIM" doesn't mean deprovisioning works. Pick a terminated employee from six months ago; verify they can't sign into anything anywhere. The first run almost always finds at least one orphan.
- Shared service accounts. One "ci-deploy" service account used by every team is one compromise away from total compromise. One identity per workload. The "but that's a lot of identities" objection is solved by automation, not by sharing.
- Treating NHIs as a side problem. Most identity programs spend 90% of their effort on the 1% of identities that are humans. With NHIs at 50:1 ratio and rising, that allocation is upside-down. Inventory NHIs, assign owners, and apply the same lifecycle rigor (joiner/mover/leaver, rotation, off-boarding) you give to people. See the NHI section.
- OAuth-app consent without review. A user clicks "Allow" on a third-party SaaS that requests mail.read across the tenant, and the app now has eyes on every mailbox. Block end-user consent on high-risk scopes; require admin approval; review consented apps quarterly.
- AI agents launched with whatever token was handy. The new "internal copilot" project ships using a developer's personal cloud credentials or a shared "ai-svc" account with broad permissions. Each agent needs its own scoped identity, a logged tool-invocation trail, and a human approval gate for high-impact actions.
- Trusting MFA without considering session theft. Phishing-resistant MFA stops credential phishing; it doesn't stop session-cookie theft via infostealer. Add device-bound session credentials, continuous-access evaluation, and short admin session lifetimes.
- Letting the root / Global Admin / Super Admin sprawl. The break-glass accounts should be a tiny, named, sealed-credentials set. Every additional Global Admin is an attacker's path. Audit monthly.
- Outsourcing identity to a SaaS without an audit log integration. If your IdP, your CIAM, or your B2B-SSO provider doesn't stream audit logs into your SOC, the detections aren't running. The biggest IdP breaches of the past few years were discovered by the SaaS provider, not by the customer; assume that's your data path too unless you've wired it differently.
Further reading
Specifications & standards
- RFC 6749 - OAuth 2.0
- OpenID Connect Core 1.0
- SAML 2.0 Core (OASIS)
- RFC 7644 - SCIM Protocol
- W3C WebAuthn Level 2
- FIDO Alliance specifications
- NIST SP 800-63 - Digital Identity Guidelines
- NIST SP 800-207 - Zero Trust Architecture
Provider docs
- AWS IAM User Guide
- AWS IAM Access Analyzer
- AWS IAM Identity Center
- Microsoft Entra ID documentation
- Entra Privileged Identity Management
- Google Cloud IAM documentation
- GCP Workload Identity Federation
Attack & offense research
- BloodHound (cloud attack-path graph)
- PMapper - AWS principal mapper
- Pacu - AWS exploitation framework
- Hacking the Cloud
- MITRE ATT&CK Cloud matrices
Related CSOH pages
- Zero Trust - the architectural pattern identity-driven security implements.
- Landing zones - where IAM gets baked in at organization birth.
- Kubernetes - workload identity, RBAC, admission control.
- CI/CD - OIDC trust and pipeline identity.
- Cloud SOC - where identity-abuse detections live.
- GRC - how IAM controls satisfy SOC 2 / ISO 27001 / PCI DSS.
- Glossary - every IAM term on this page, defined.
FAQ
Why is IAM the #1 root cause of cloud breaches?
Every cloud action is an authenticated API call, so identity is the perimeter. When an attacker compromises a credential - a leaked access key, a phished session cookie, a misconfigured OIDC trust - they inherit whatever permissions that identity holds, and most identities hold far more than they need. The Verizon DBIR, Mandiant M-Trends, and IBM Cost of a Data Breach reports all converge on the same finding: stolen or abused credentials are the top initial-access vector in cloud, ahead of vulnerability exploitation and phishing-of-endpoints. The fix is not a single product; it's a program - strong authentication for humans, short-lived tokens for workloads, least privilege enforced continuously, and detections that fire on identity anomalies.
What's the difference between RBAC, ABAC, and ReBAC?
RBAC (role-based) grants permissions to named roles, and assigns subjects to roles - simple, auditable, but combinatorially expensive at scale. ABAC (attribute-based) grants permissions through policies that evaluate attributes of the subject, resource, action, and context (e.g., "engineers can read S3 buckets tagged team=their-team in us-east-1 during business hours") - flexible, scales without role explosion, but harder to reason about. ReBAC (relationship-based, popularized by Google Zanzibar / OpenFGA / SpiceDB) models access as graph relationships ("user is editor of document, document is in folder, folder is owned by team") - fits SaaS sharing models and hierarchical resources. Most cloud platforms are RBAC at the surface (roles, policies) with ABAC primitives (tag / condition keys / IAM conditions) layered in. Use RBAC for cloud-control-plane access; ABAC for scale-out tenant data-plane decisions; ReBAC inside applications that model sharing.
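The three models are easier to compare side by side as code. The sketch below answers the same kind of question ("may this subject do this?") in each style; every name, role, tag, and tuple in it is hypothetical, and the ReBAC part is a hand-rolled lookup over relationship tuples in the spirit of Zanzibar, not the API of OpenFGA or SpiceDB.

```python
# RBAC: permissions hang off roles; subjects are assigned roles.
ROLE_PERMISSIONS = {"storage-reader": {"s3:GetObject"}}
USER_ROLES = {"alice": {"storage-reader"}}


def rbac_allows(user: str, action: str) -> bool:
    return any(
        action in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


# ABAC: one policy compares attributes of the subject and the resource.
def abac_allows(subject: dict, resource: dict, action: str) -> bool:
    return action == "s3:GetObject" and subject.get("team") == resource.get("team")


# ReBAC: access follows (object, relation, subject) edges in a graph.
TUPLES = {
    ("document:roadmap", "editor", "user:alice"),
    ("document:roadmap", "parent", "folder:plans"),
    ("folder:plans", "owner", "team:platform"),
    ("team:platform", "member", "user:bob"),
}


def rebac_can_view(user: str, document: str) -> bool:
    """Direct editors can view; so can members of a team owning the parent folder."""
    if (document, "editor", user) in TUPLES:
        return True
    for obj, relation, folder in TUPLES:
        if obj == document and relation == "parent":
            for f_obj, f_relation, team in TUPLES:
                if f_obj == folder and f_relation == "owner" and (team, "member", user) in TUPLES:
                    return True
    return False
```

Notice what grows in each model as the organization does: RBAC needs new roles, ABAC needs consistent tags, and ReBAC needs the relationship graph kept up to date.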
How do I get rid of long-lived cloud access keys?
For humans: federate through your IdP (Okta, Entra ID, Google Workspace) into IAM Identity Center / Entra-based federation / Workforce Identity Federation. Humans get short-lived credentials via SSO; static keys go away. For workloads in cloud: use the native workload-identity primitives - IAM Roles for EC2 / Lambda, IRSA / EKS Pod Identity for Kubernetes on AWS, Managed Identities / Azure Workload Identity for Azure, GCE / GKE Workload Identity for GCP. For workloads outside cloud (CI/CD, on-prem agents, third-party SaaS): use OIDC trust - GitHub Actions, GitLab, CircleCI all publish OIDC tokens that AWS / Azure / GCP can trust without any pre-shared secret. Rotate the remaining long-lived keys aggressively, alert on creation, and track them to zero. The metric to chase is "count of access keys older than N days" trending toward zero.
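The "keys older than N days" metric is simple to compute once key metadata is exported. A minimal sketch, assuming each entry looks like the AccessKeyMetadata items returned by IAM's ListAccessKeys (UserName, AccessKeyId, Status, CreateDate as a timezone-aware datetime). Counting only Active keys is a choice made here, not a rule from the text.

```python
from datetime import datetime, timedelta, timezone


def count_old_keys(keys, max_age_days: int, now=None) -> int:
    """Count active access keys created more than max_age_days ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return sum(
        1
        for key in keys
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    )
```

Emit the number daily per account and chart it; the trend toward zero is more useful than any single reading.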
Is MFA enough for cloud admin accounts?
MFA is necessary but not sufficient, and the type of MFA matters more than the presence of it. SMS-based codes are phishable and SIM-swappable. TOTP (authenticator apps) is phishable through real-time relay attacks (Evilginx-class kits). The bar for cloud admins in 2026 is phishing-resistant authentication - FIDO2 security keys, platform passkeys, or device-bound credentials via WebAuthn. Pair that with conditional access (only from managed devices, only from expected geographies, step-up auth for sensitive actions) and session-binding (tokens tied to the device that minted them). Treat "MFA enabled" as a compliance floor; treat "phishing-resistant for everyone with prod cloud access" as the actual security target.
What is iam:PassRole and why does it matter?
iam:PassRole is the AWS permission that lets a principal hand a role to a service to assume on its behalf. If you can pass a powerful role to a service you control (Lambda, EC2, SageMaker, Glue, CloudFormation, dozens more), you can effectively assume that role's permissions through the service. It's the single most-abused privilege escalation primitive in AWS - the equivalent of "I can't read this S3 bucket, but I can launch an EC2 instance with a role that can." The defense: scope iam:PassRole tightly using the iam:PassedToService condition, restrict which roles can be passed by ARN, and audit it with IAM Access Analyzer's unused-permission and external-access analyzers. Azure's analog is the ability to assign a managed identity with elevated rights; GCP's is the iam.serviceAccounts.actAs permission. The pattern repeats across all three clouds; the names differ.
What does just-in-time (JIT) access look like in cloud?
JIT access means standing permissions are minimal, and elevated permissions are requested per-need with an expiration. The mechanics vary by cloud and platform: AWS via IAM Identity Center permission sets + approval workflows or via tools like ConductorOne / Tenable Cloud Security / Sym; Azure via Entra PIM (Privileged Identity Management) with eligibility, activation, MFA-on-activation, and time bounds; GCP via temporary IAM grants with conditions or Privileged Access Manager. The audit trail captures who requested, who approved, why, and when access expired. The operational benefit is that the breach blast-radius from a compromised user is small most of the time - they only have elevated privileges during the window they actively requested. The maturity step beyond JIT is just-enough-access: even during the window, only the specific actions needed, not a generic admin role.
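Stripped of vendor detail, the JIT mechanics are a grant record with an approver and an expiry, checked at use time. The sketch below shows that core under stated assumptions: the Grant fields, the no-self-approval rule, and the function names are illustrative and do not model PIM, Identity Center, or Privileged Access Manager specifically.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Grant:
    """One time-boxed elevation; doubles as the audit record."""
    principal: str
    role: str
    reason: str
    approved_by: str
    expires_at: datetime


def approve(principal: str, role: str, reason: str, approver: str,
            minutes: int, now: datetime) -> Grant:
    """Create a grant that expires `minutes` after `now`."""
    if approver == principal:
        raise ValueError("requester cannot approve their own elevation")
    return Grant(principal, role, reason, approver, now + timedelta(minutes=minutes))


def is_active(grant: Grant, now: datetime) -> bool:
    """Elevated access is valid only inside the approved window."""
    return now < grant.expires_at
```

Just-enough-access would narrow the role field further, for example to a list of specific actions rather than a generic admin role.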
What are non-human identities (NHIs) and why are they a cloud security problem?
Non-human identities (NHIs) are every identity that isn't a person - service accounts, API keys, OAuth tokens, secrets, certificates, workload roles, third-party SaaS integration credentials, bots, and now AI agents. In a typical 2026 cloud estate, NHIs outnumber humans 45-100 to 1, almost none are tied to a clear owner, most have no defined lifecycle, and a meaningful percentage are over-permissioned. They are the dominant blast-radius surface and have driven a string of recent breaches - Cloudflare via leaked Okta service tokens, Microsoft via a legacy test OAuth app abused by Midnight Blizzard, Internet Archive via an exposed GitLab token, Dropbox Sign via a compromised service account, and the long tail of GitHub leaked AWS access keys. The 2026 program covers four things: (1) inventory every NHI across cloud, SaaS, and code, (2) assign each one an owner and a purpose, (3) put every secret in a vault with rotation and short-lived alternatives where possible (OIDC trust, workload identity), (4) detect anomalous NHI behavior - new consents, unusual call patterns, dormant-then-active. The category has a maturing vendor space: Astrix, Entro, Oasis Security, Aembit, Token Security, Clutch, Britive, Natoma, Andromeda Security.
How do I detect identity abuse in cloud?
The high-signal detections to build first: (1) console login or AssumeRole from a new ASN / country / device for a privileged identity; (2) GetCallerIdentity called by a long-lived access key from an unusual source - the "who am I, where are my permissions" reconnaissance step is reliable post-compromise behavior; (3) impossible-travel between sessions; (4) MFA-disable, password-reset, or new-access-key events on privileged accounts; (5) AssumeRole chain length anomalies (one role assumed via another via another); (6) new OAuth app consent grants in your IdP; (7) iam:PassRole / iam:CreateAccessKey / iam:CreateLoginProfile / iam:AttachUserPolicy / sts:GetFederationToken from non-admin paths; (8) service-account-key creation in GCP, federated-credential addition in Azure, IAM-user creation in AWS - all rare in a mature org, all high-value to alert on. Feed CloudTrail / Activity Log / Cloud Audit Logs into a SIEM or detection-engineering platform (Panther, Anvilogic, SnapAttack, native Sentinel) and tune from there.
Where next
- Zero Trust - identity is the foundation; this is how the rest of the architecture builds on it.
- Landing zones - bake the IAM model in at organization birth, not retrofit it later.
- Kubernetes - workload identity, RBAC, admission control, the cluster-side mechanics.
- CI/CD - where OIDC trust eliminates the last static secrets.
- Cloud SOC - the detections that fire when identity is abused.
- GRC - how IAM controls map to SOC 2 / ISO 27001 / PCI DSS evidence.
- Friday Zoom - IAM, federation, and identity-abuse incidents come up almost every week. Drop in.