IAM & Cloud Identity

Identity is the perimeter. The #1 root cause of cloud breaches isn't an unpatched server or a zero-day - it's a stolen credential, a misconfigured trust, or a role that holds more permissions than anyone remembers granting. This is a vendor-neutral guide to cloud IAM: identity types (human, workload, service), federation and SSO, RBAC vs ABAC vs ReBAC, least privilege at scale, JIT access, workload identity without long-lived keys, non-human identity (NHI) sprawl (NHIs now outnumber humans roughly 50:1), the privilege escalation paths attackers actually use, phishing-resistant MFA, and the detections that fire when identity is abused.

The 30-second version: In cloud, identity is the perimeter. Every action is an authenticated API call, every misconfigured trust is a foothold, every overprivileged role is a blast radius. The work of cloud IAM is to (1) federate humans through one identity provider so credentials are short-lived and revocable, (2) eliminate long-lived workload secrets by adopting platform workload identity (IRSA, GKE Workload Identity, Azure Workload Identity, OIDC trust for CI/CD), (3) continuously narrow permissions using the cloud's own analyzers plus a third-party tool when the volume demands it, and (4) detect identity abuse with high-signal alerts that fire on the post-compromise behaviors attackers exhibit.

The trap is treating IAM as a one-time configuration. It's a discipline - drift, role proliferation, exception accumulation, and shadow identities (service accounts created in side projects, OAuth apps consented to, federated trusts added to integrate a new SaaS) all bend the curve away from least privilege every week. The mature program treats identity like a code base: reviewed, versioned, tested, and monitored.

On this page

  1. What IAM is in cloud
  2. Identity types
  3. Federation & SSO
  4. RBAC vs ABAC vs ReBAC
  5. Least privilege & permissions analyzers
  6. JIT access & PAM
  7. Workload identity (no more keys)
  8. Non-human identities (NHI)
  9. Secrets in developer workflows
  10. Privilege escalation paths
  11. MFA & phishing-resistant auth
  12. Detection signals for identity abuse
  13. AWS, Azure, and GCP side-by-side
  14. Maturity stages
  15. Common pitfalls
  16. Further reading
  17. FAQ
  18. Where next

What IAM is in cloud

IAM - Identity and Access Management - is the set of systems that answer two questions on every cloud API call: who is this, and are they allowed to do this. The first is authentication (AuthN); the second is authorization (AuthZ). In cloud, both happen at the control plane on every action, not once at the front door, which is the central difference from the on-prem AD-centric model many practitioners came up in.

The on-prem model is roughly: a user logs into a workstation, the workstation gets a Kerberos ticket from a domain controller, the ticket grants implicit trust to most internal resources behind the same firewall. Network position is most of the access decision; identity is a constant inside the perimeter. The cloud model has no perimeter - every call to the AWS, Azure, or GCP API is a fresh authorization decision evaluated against an explicit policy. There is no "behind the firewall"; an IMDS-exposed role abused from a compromised pod talks to the same control plane as the security team's own console session.

The practical implications: every workload needs an identity (no shared service accounts), every policy is auditable (every Allow / Deny is a log line), every change is an event (a new role assumption is detectable in real time), and every credential is interesting to an attacker (long-lived keys leak; short-lived tokens are the only safe pattern). The IAM team's job is to make that surface area legible, narrow, and observable.
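To make "a fresh authorization decision evaluated against an explicit policy" concrete, here is a minimal sketch of per-call evaluation. It assumes a simplified, hypothetical statement shape (`effect`, `action`, `resource` with shell-style wildcards) rather than any provider's real policy grammar, and it models only two rules: default deny, and explicit Deny beats Allow.

```python
from fnmatch import fnmatchcase

def is_allowed(statements, action, resource):
    """Default deny; any matching explicit Deny overrides every Allow."""
    allowed = False
    for stmt in statements:
        if not fnmatchcase(action, stmt["action"]):
            continue
        if not fnmatchcase(resource, stmt["resource"]):
            continue
        if stmt["effect"] == "Deny":
            return False  # explicit deny wins immediately
        allowed = True
    return allowed

# Hypothetical policy for illustration only.
POLICY = [
    {"effect": "Allow", "action": "storage:Get*", "resource": "bucket/reports/*"},
    {"effect": "Deny", "action": "storage:*", "resource": "bucket/reports/secret/*"},
]

if __name__ == "__main__":
    for call in [
        ("storage:GetObject", "bucket/reports/q1.csv"),
        ("storage:GetObject", "bucket/reports/secret/x"),
        ("storage:PutObject", "bucket/reports/q1.csv"),
    ]:
        print(call, is_allowed(POLICY, *call))
```

Real evaluators add more layers (organization guardrails, resource policies, session policies), but the shape is the same: nothing is allowed until a statement says so, and every decision can be logged.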

Identity types

The first taxonomy to internalize: cloud IAM serves three distinct populations, with different lifecycles, threat models, and tooling.

Human identities

Workforce - employees, contractors, vendor partners. Lives in your corporate IdP (Okta, Entra ID, Google Workspace). Federated into every cloud, SaaS, and internal app.

Customer (CIAM) - end users of your product. Auth0, Cognito, Entra External ID, Firebase Auth, Stytch, WorkOS. Different scale (millions of users, social login, account-recovery flows) and different threat model (credential stuffing, ATO).

B2B - your customer's IT admins integrating with your product via SSO. SAML / OIDC inbound, SCIM provisioning, the messy world of "every customer wants a different IdP."

Workload identities

Anything that isn't a human but needs to call an API: a pod in Kubernetes, a serverless function, a CI/CD job, an on-prem agent, a third-party SaaS connector, a script running in someone's automation. Historically these used long-lived access keys; the modern pattern is short-lived tokens minted from a platform identity (IRSA, GKE Workload Identity, Azure Workload Identity) or from a federated OIDC trust. Workload identities typically outnumber humans 10-100×, and most orgs underestimate their own count by an order of magnitude.

Service / principal identities

The cloud-native IAM constructs - AWS IAM users / roles / federated principals, Azure service principals / managed identities, GCP service accounts / workload-identity pools. These are the shape identity takes inside the cloud, regardless of whether the entity behind them is human, workload, or another service. The principal is what policies attach to and what logs reference. Most identity work in cloud is really principal hygiene: who owns this role, what touches it, when was it last used, is it still needed.

The three populations overlap in unintuitive ways. A workforce engineer logs into AWS via SSO (human → principal), pushes code via a CI/CD job that assumes a deploy role (workload → principal), and the deployed service runs as a pod-bound role that calls S3 (workload → principal). Three different identity types, one continuous chain of accountability if the logs and trust relationships are wired correctly. The most common audit gap is that the chain isn't continuous - a CI job runs as a generic shared role, so "who deployed this" can't be answered. Fix the federation, fix the audit.

Federation & SSO

Federation is the pattern that lets a single identity provider authoritatively assert "this is user X" to multiple downstream systems. SSO is the user-visible benefit: one login, many apps. The protocols and the IdP landscape:

SAML 2.0

The grandfather of enterprise federation. XML-based, browser-redirect flow, signed assertions. Still dominant for SaaS-to-SaaS federation in B2B contexts because every IdP and every enterprise app speaks it. Verbose, awkward to debug, vulnerable to subtle bugs in signature validation if you ever roll your own - which you shouldn't.

OIDC (OpenID Connect)

The modern, JSON / JWT-based federation protocol built on OAuth 2.0. Cleaner, mobile-friendly, easier to debug, and - critically for cloud - the protocol cloud providers trust for federating workloads from CI/CD systems, Kubernetes clusters, and on-prem agents. If you're integrating a new system in 2026, prefer OIDC; reserve SAML for legacy partners that don't speak OIDC.

SCIM (System for Cross-domain Identity Management)

The provisioning protocol that complements SAML / OIDC. SAML / OIDC authenticate sessions; SCIM keeps user lifecycles (create / update / deprovision) in sync between the IdP and downstream apps. Without SCIM, you authenticate a user successfully - and the destination app has no idea they exist, or worse, doesn't know they were terminated last quarter. The off-boarding gap is the most expensive consequence of skipping SCIM.
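The off-boarding gap can be pictured as a set difference between the IdP's active users and the accounts a downstream app still holds. The sketch below is not SCIM itself (which is a REST protocol); it only shows the reconciliation SCIM automates, using hypothetical in-memory user lists.

```python
def find_lifecycle_gaps(idp_active_users, app_accounts):
    """Compare IdP truth against an app's account list (case-insensitive)."""
    idp = {u.lower() for u in idp_active_users}
    app = {u.lower() for u in app_accounts}
    return {
        # Accounts the app still honors although the IdP no longer lists them.
        "orphaned_in_app": sorted(app - idp),
        # Users who can authenticate but whom the app has never heard of.
        "missing_in_app": sorted(idp - app),
    }

if __name__ == "__main__":
    # Hypothetical data for illustration.
    idp_users = ["ana@example.com", "raj@example.com"]
    app_users = ["Ana@example.com", "former.employee@example.com"]
    print(find_lifecycle_gaps(idp_users, app_users))
```

Anything that shows up under `orphaned_in_app` is a candidate missed deprovision and deserves immediate review.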

The IdP landscape

One IdP, one source of truth

The architectural decision that pays the most dividends: pick one workforce IdP, federate every cloud and every SaaS through it, and treat that IdP's user lifecycle (joiner / mover / leaver via HRIS-driven SCIM) as the authoritative event stream for access. Multi-IdP environments - usually the result of acquisitions or org-chart politics - produce orphan accounts, missed deprovisions, and security exposure. Consolidate, even if it takes 18 months.

RBAC vs ABAC vs ReBAC

The three authorization models you'll encounter, and where each fits in a cloud environment.

Model | How it grants access | Best for | Watch out for
RBAC | Permissions attach to named roles; subjects are assigned roles. | Cloud control-plane access, small-to-medium app permissions, anything an auditor needs to read at a glance. | Role explosion at scale (1 role per team × per env × per privilege level = thousands of roles).
ABAC | Policies evaluate attributes of subject, resource, action, and context - tags, claims, time, IP, MFA state. | Scale-out tenant data planes, multi-team cloud accounts using tags to scope access, per-environment guardrails. | Harder to reason about; one wrong tag flips the access decision. Strong tagging discipline is a prerequisite.
ReBAC | Access is a graph relationship - user is editor of doc, doc is in folder, folder is owned by team. | SaaS sharing models (Google Docs / Notion / Figma-style), hierarchical resources, fine-grained app permissions. | Newer pattern; tooling (OpenFGA, SpiceDB, Permify, Cerbos) is maturing but less embedded in cloud-native IAM than RBAC / ABAC.

In practice, cloud-native IAM is RBAC at the surface with ABAC primitives layered in. You define roles (RBAC), and within the role's policy you use condition keys / aws:ResourceTag / aud / claims / scope to add attribute-based constraints (ABAC). The pure-RBAC org with thousands of roles is at one failure mode; the pure-ABAC org with one role and 200 condition keys is at the other. Most mature programs land in the middle: a small set of roles per persona, ABAC conditions inside each role to scope to the right resources.
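A rough sketch of "RBAC at the surface with ABAC layered in": the role decides which actions are possible at all, and a tag comparison scopes them to the right resources. The role names, the `team` tag, and the data shapes are assumptions for illustration, not any provider's schema.

```python
# Hypothetical role catalogue: a small set of roles per persona.
ROLE_ACTIONS = {
    "developer": {"read", "write"},
    "auditor": {"read"},
}

def authorize(principal, action, resource):
    # RBAC step: the role must grant the action at all.
    if action not in ROLE_ACTIONS.get(principal["role"], set()):
        return False
    # ABAC step: the principal's team tag must match the resource's team tag.
    team = principal["tags"].get("team")
    return team is not None and team == resource["tags"].get("team")

if __name__ == "__main__":
    dev = {"role": "developer", "tags": {"team": "payments"}}
    own_bucket = {"tags": {"team": "payments"}}
    other_bucket = {"tags": {"team": "search"}}
    print(authorize(dev, "write", own_bucket))
    print(authorize(dev, "write", other_bucket))
```

The second call also shows the ABAC failure mode from the table: one wrong or missing tag flips the decision, which is why tagging discipline comes first.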

ReBAC belongs inside the application, not in the cloud control plane. If you're building product authorization (who can see which document, which org's data, which row in this multi-tenant table), reach for OpenFGA, SpiceDB, Cerbos, or Permify. AWS Verified Permissions (Cedar) is the AWS-native answer in the same category. Don't try to encode product authorization in IAM policies; that's a path to permission sprawl and customer-visible bugs.
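For intuition about what those ReBAC engines evaluate, here is a toy relationship check over a handful of tuples. It is a sketch loosely inspired by Zanzibar-style tuples, not the API of OpenFGA, SpiceDB, or any other product; the tuple layout, the `parent` relation, and the "editor implies viewer" rule are assumptions chosen for the example.

```python
# (object, relation, subject). A subject like "team:product#member" means
# "anyone who has the member relation on team:product".
TUPLES = {
    ("doc:roadmap", "parent", "folder:planning"),
    ("folder:planning", "viewer", "team:product#member"),
    ("team:product", "member", "user:ana"),
    ("doc:roadmap", "editor", "user:raj"),
}
IMPLIED_BY = {"viewer": ["editor"]}  # holding editor implies viewer

def check(user, relation, obj, depth=0):
    if depth > 10:  # crude guard against cyclic graphs
        return False
    for o, r, subject in TUPLES:
        if o != obj:
            continue
        if r == relation:
            if subject == user:
                return True
            if "#" in subject:  # expand a userset such as team#member
                group, group_rel = subject.split("#", 1)
                if check(user, group_rel, group, depth + 1):
                    return True
        elif r == "parent" and check(user, relation, subject, depth + 1):
            return True  # inherit the relation from the parent object
    return any(check(user, stronger, obj, depth + 1)
               for stronger in IMPLIED_BY.get(relation, []))

if __name__ == "__main__":
    print(check("user:ana", "viewer", "doc:roadmap"))
    print(check("user:ana", "editor", "doc:roadmap"))
```

Ana can view the document only because of a chain of relationships (team member, folder viewer, document parent), which is exactly the kind of decision that is awkward to express in cloud IAM policy.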

Least privilege & permissions analyzers

"Grant the minimum permissions required" is the most-repeated and least-implemented principle in cloud security. The reasons it's hard at scale: the cloud has thousands of distinct actions, your application uses an unknown subset, that subset changes as code evolves, and "what was used" requires log analysis nobody has time to do manually. The 2026 toolchain finally makes this tractable.

Native permissions analyzers

Third-party permissions / CIEM tools

The right-sizing loop

The operational practice that makes least-privilege real: (1) start a new workload with a deny-by-default role; (2) run it in staging with broad audit logging; (3) use a policy generator or iamlive to derive the minimum permissions actually used; (4) ship that minimum; (5) re-run the analyzer monthly and trim what falls into disuse. The pattern shrinks blast radius continuously without slowing engineers down - the analyzer does the work, the engineer reviews and approves the trim.
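Step (5) of the loop is, at its core, set arithmetic between what a role grants and what the logs show it used. A minimal sketch, assuming the granted actions and the audit events have already been exported into the hypothetical shapes below:

```python
def right_size(granted_actions, audit_events):
    """Split a role's grants into what to keep and what to trim."""
    used = {event["action"] for event in audit_events}
    granted = set(granted_actions)
    return {
        "keep": sorted(granted & used),
        "trim": sorted(granted - used),           # granted but never exercised
        "denied_attempts": sorted(used - granted),  # attempted but not granted
    }

if __name__ == "__main__":
    # Hypothetical inputs for illustration.
    granted = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "sqs:SendMessage"]
    events = [
        {"action": "s3:GetObject"},
        {"action": "s3:GetObject"},
        {"action": "sqs:SendMessage"},
    ]
    print(right_size(granted, events))
```

The `trim` list is a proposal for an engineer to review, not something to apply blindly: an action used once a quarter will look unused in a one-month window.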

JIT access & PAM in cloud

Standing admin access is the highest-impact risk in any cloud environment. The fewer humans who hold standing admin rights, the smaller the blast radius when one gets phished. JIT - just-in-time - access is the pattern that collapses standing privilege to near-zero by making elevated access requestable, approvable, and time-bounded.
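The three properties named above (requestable, approvable, time-bounded) fit in a few lines. This is a sketch of the idea only; the grant record and function names are hypothetical and do not correspond to any vendor's JIT product.

```python
from datetime import datetime, timedelta, timezone

def request_access(user, role, approver, minutes, now):
    """Create a time-bounded grant; the requester cannot approve themselves."""
    if approver == user:
        raise ValueError("self-approval is not allowed")
    return {
        "user": user,
        "role": role,
        "approved_by": approver,
        "granted_at": now,
        "expires_at": now + timedelta(minutes=minutes),
    }

def is_active(grant, now):
    # No revocation job is needed for expiry: the grant simply stops matching.
    return grant["granted_at"] <= now < grant["expires_at"]

if __name__ == "__main__":
    t0 = datetime(2026, 1, 15, 9, 0, tzinfo=timezone.utc)
    grant = request_access("ana", "prod-admin", "raj", minutes=60, now=t0)
    print(is_active(grant, t0 + timedelta(minutes=30)))
    print(is_active(grant, t0 + timedelta(minutes=61)))
```

The grant record doubles as the audit trail: who asked, who approved, and when the access ended.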

What JIT looks like by cloud

PAM (Privileged Access Management) in cloud

Traditional PAM (CyberArk, BeyondTrust, Delinea) is built for credential vaulting in an on-prem world - store the admin password, broker the session, record everything. The cloud-native version is different: there are fewer secrets to vault (because federated SSO replaces passwords), but the brokering / recording / approval workflow still matters for sensitive operations. The 2026 PAM shape:

Workload identity (no more long-lived keys)

Long-lived access keys are the most common source of cloud breaches. Every leaked GitHub commit with an AKIA-prefixed key, every Stack Overflow paste with a service-account JSON, every disgruntled-employee export - long-lived credentials get exfiltrated and reused. The fix is to eliminate them, not to rotate them harder.

Inside the cloud - platform workload identity

Outside the cloud - OIDC trust for CI/CD and on-prem

The pattern that eliminated the last common need for long-lived cloud credentials: every major CI/CD platform now publishes OIDC tokens that AWS / Azure / GCP can trust as federated identities. No secret to leak; the trust is a verifiable JWT signed by the CI provider for that specific job.
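What the cloud side of such a trust checks can be sketched as a comparison of token claims against a trust condition. The sketch assumes the JWT signature has already been verified against the CI provider's published keys (that step is deliberately omitted here) and uses claim values in the style GitHub Actions documents; the organization and repository names are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical trust condition: only the main branch of one repository.
TRUST = {
    "iss": "https://token.actions.githubusercontent.com",
    "aud": "sts.amazonaws.com",
    "sub_pattern": "repo:example-org/payments:ref:refs/heads/main",
}

def trust_allows(claims, trust, now_epoch):
    """Claims come from an ALREADY signature-verified token."""
    return (
        claims.get("iss") == trust["iss"]
        and claims.get("aud") == trust["aud"]
        and fnmatchcase(claims.get("sub", ""), trust["sub_pattern"])
        and claims.get("exp", 0) > now_epoch
    )

if __name__ == "__main__":
    claims = {
        "iss": "https://token.actions.githubusercontent.com",
        "aud": "sts.amazonaws.com",
        "sub": "repo:example-org/payments:ref:refs/heads/main",
        "exp": 2_000_000_000,
    }
    print(trust_allows(claims, TRUST, now_epoch=1_900_000_000))
```

The `sub_pattern` is where trusts most often go wrong: a pattern like `repo:example-org/*` lets any repository in the organization assume the role, so keep it as specific as the workflow allows.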

The metrics to chase

Two numbers that should both trend to zero: count of IAM users with access keys (every key is potential exposure) and median age of access keys (older keys are likelier to have been exposed). A mature program ends with a handful of break-glass IAM users, every other principal a role / service account / federated identity, and no key older than 30-90 days.
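Both metrics are cheap to compute once key metadata is exported. A minimal sketch, assuming a hypothetical list of records with a `created` date per key (for example, derived from a credential report):

```python
from datetime import date
from statistics import median

def key_metrics(keys, today, max_age_days=90):
    ages = [(today - key["created"]).days for key in keys]
    return {
        "key_count": len(keys),
        "median_age_days": median(ages) if ages else 0,
        "older_than_max": sum(age > max_age_days for age in ages),
    }

if __name__ == "__main__":
    # Hypothetical inventory for illustration.
    inventory = [
        {"user": "ci-legacy", "created": date(2025, 1, 31)},
        {"user": "break-glass", "created": date(2026, 1, 1)},
        {"user": "vendor-sync", "created": date(2026, 1, 21)},
    ]
    print(key_metrics(inventory, today=date(2026, 1, 31)))
```

Tracking these as a time series per account makes regressions visible: a new key appearing is as interesting as an old one lingering.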

Non-human identities (NHI) - the sprawl problem

"Workload identity" (the previous section) is one slice of a much bigger category: non-human identities - every identity in your environment that isn't a person. The cloud-native workload identities you assign to pods, functions, and VMs are NHIs. So are the OAuth tokens your SaaS apps trade with each other, the API keys your finance team pasted into Zapier two years ago, the GitHub personal access token used by an Ansible runbook, the certificate that lets your CI runner deploy to Kubernetes, the service principal the third-party DLP scanner uses to read every mailbox in Microsoft 365, the agent identity the new internal AI assistant authenticates as, and the long tail of bots, scripts, and integrations no one remembers creating. NHI as a discrete security category exploded between 2023 and 2026 because the volume crossed the threshold where humans could no longer track it manually.

Why NHI is now the dominant identity problem

What counts as an NHI

Cloud-native NHIs

AWS IAM users / IAM roles / instance profiles, Azure service principals / managed identities / federated credentials, GCP service accounts / workload-identity pools, Kubernetes service accounts. The ones the previous section covers - native to the cloud control plane and (mostly) replaceable with short-lived federated tokens.

SaaS-to-SaaS NHIs

OAuth apps consented to in Google Workspace / Microsoft 365 / Slack / GitHub. API keys in Stripe, Datadog, PagerDuty, Snowflake, Salesforce, HubSpot. Webhook signing secrets. The Okta/Cloudflare incident, the Microsoft Midnight Blizzard test-tenant compromise, the Dropbox Sign breach, and most "third-party SaaS got popped, blast radius hit us" incidents live in this category.

Workload & agent NHIs

CI/CD tokens (GitHub PATs, GitLab deploy tokens, CircleCI keys), bot accounts in chat / ticketing / on-call, RPA bots, internal scripts, scheduled jobs, AI agent identities. High-volume, high-churn, low-visibility - exactly the population legacy IGA tools were not built for.

The recent breach pattern

The 2023-2026 incident catalogue is heavy on NHI-origin breaches: Cloudflare's October 2023 incident traced to leaked Okta service tokens; Microsoft's Midnight Blizzard compromise via a legacy OAuth app with elevated mailbox scopes; the Internet Archive's 2024 breach via an exposed GitLab token; Dropbox Sign via a compromised production service account; New York Times source-code exposure via a GitHub PAT; Cloudflare again in November 2023, when credentials stolen in the Okta support-case incident and never rotated were used to reach its Atlassian systems; and a steady drumbeat of TruffleHog / detect-secrets scans finding credentials in public repos. The common thread: an NHI was created for a real reason years ago, never reviewed, the secret leaked through one of N copies, and the blast radius was wide because nobody had scoped the original grant.

The NHI program - what "owning this" looks like

  1. Inventory. Across every cloud account, every SaaS, every code repo, every secret vault, every CI system. The first run of an NHI inventory typically surprises the security team with an order-of-magnitude higher count than expected.
  2. Ownership. Every NHI gets a named human owner and a documented purpose. No owner = candidate for deletion after a deprecation window.
  3. Classify by risk. What can this NHI do, and against what data? An NHI with mail.read on every mailbox in your tenant is critical; one that posts to a single Slack channel is not.
  4. Replace static secrets with short-lived tokens wherever the platform supports it. Cloud workloads → workload identity. CI/CD → OIDC trust (see the previous section). SaaS-to-SaaS where the provider supports OIDC or mTLS instead of API keys → use it.
  5. Vault the rest. The static keys that can't be eliminated belong in a secrets manager with rotation policies, access audit, and break-glass procedures. No more .env in Notion.
  6. Rotate on a schedule and on signal. Time-based rotation for everything; immediate rotation on owner-change, project-end, or any leak signal.
  7. Detect anomalous NHI behavior. Volume spikes, new source ASNs, unusual API calls, dormant-then-active patterns, new OAuth consents on high-scope SaaS - same detection playbook as human identities but with different baselines (NHIs are usually deterministic; humans aren't).
  8. Off-board on every project / employee exit. Treat NHI deprovisioning as part of every leaver workflow and every project sunset. The orphan service account is the most-reused initial-access vector in this whole category.
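Steps 1-3 and 7 of the program above reduce to a triage pass over the inventory. The sketch below assumes a hypothetical inventory record shape and a hypothetical list of high-risk scopes; real inventories come from cloud APIs, SaaS admin consoles, and secret vaults.

```python
# Hypothetical scope names considered critical in this example.
HIGH_RISK_SCOPES = {"mail.read", "admin", "iam:write"}

def triage(nhi, dormant_after_days=90):
    """Return the list of findings for one non-human identity record."""
    findings = []
    if not nhi.get("owner"):
        findings.append("no owner")
    if HIGH_RISK_SCOPES & set(nhi.get("scopes", [])):
        findings.append("high-risk scope")
    if nhi.get("days_since_last_use", 0) > dormant_after_days:
        findings.append("dormant")
    if nhi.get("static_secret"):
        findings.append("static secret")
    return findings

if __name__ == "__main__":
    inventory = [
        {"name": "dlp-scanner", "owner": None, "scopes": ["mail.read"],
         "days_since_last_use": 2, "static_secret": True},
        {"name": "slack-notifier", "owner": "ana", "scopes": ["chat:write"],
         "days_since_last_use": 1, "static_secret": False},
    ]
    for item in inventory:
        print(item["name"], triage(item))
```

An identity with both "no owner" and "high-risk scope" is the combination to work first.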

NHI security vendors

The category has matured fast; expect consolidation through 2026-2027.

The AI-agent wrinkle

AI agents are NHIs that take non-deterministic actions. A traditional service account does the same five API calls every hour; an agent might call any tool in its toolset depending on what the model decides. That breaks the usual anomaly-detection baselines - "this NHI just called an API it has never called before" is a normal Tuesday for an agent. The 2026 patterns emerging: scope each agent's credentials to the smallest possible toolset (per-agent service accounts, not one shared "AI" role), log every tool invocation with the user context that initiated it, route high-impact actions through a human approval step, and treat agent prompts as untrusted input that can exfiltrate or escalate via tool calls. Frameworks like OWASP LLM Top 10 and the emerging NIST AI risk-management guidance cover the threat model. See also AI/ML Security.
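Two of those patterns, per-agent tool scoping and a human approval step for high-impact actions, can be sketched as a gate in front of every tool invocation. The agent names, tool names, and policy shape below are hypothetical.

```python
# Hypothetical per-agent policy: an explicit allow-list plus a subset of
# tools that always require a human approval.
AGENT_TOOLS = {
    "support-assistant": {
        "allowed": {"search_tickets", "draft_reply", "issue_refund"},
        "needs_approval": {"issue_refund"},
    },
}

def authorize_tool_call(agent, tool, approved_by_human=False):
    policy = AGENT_TOOLS.get(agent)
    if policy is None or tool not in policy["allowed"]:
        return "deny"  # unknown agent or tool outside its scoped toolset
    if tool in policy["needs_approval"] and not approved_by_human:
        return "pending_approval"
    return "allow"

if __name__ == "__main__":
    print(authorize_tool_call("support-assistant", "draft_reply"))
    print(authorize_tool_call("support-assistant", "issue_refund"))
    print(authorize_tool_call("support-assistant", "delete_user"))
```

Because the allow-list is static, "called a tool outside its list" stays a high-signal event even though the agent's behavior inside the list is non-deterministic.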

Secrets in developer workflows - IDEs, CLIs, Terraform, DevOps

Secrets managers and vaults are only as useful as the workflows that consume them. Identity is the auth story: every retrieval from a vault should authenticate with the developer's federated identity or the workload's platform identity - never with a static credential to fetch the other secrets. This section covers the integration patterns for the surfaces developers actually touch every day. For the underlying secrets-manager capabilities (rotation, dynamic secrets, versioning, audit), see Data Security & KMS - Secrets management; for pipeline-specific patterns see CI/CD - Pipeline secrets; for cluster patterns see Kubernetes - Cluster secrets.

Local development & IDEs - kill the .env file

The most common leak path on this whole page: a developer pastes a production API token into .env, the file gets committed, the secret-scanner catches it three commits later, the rotation cycle begins. The fix is to never let a static secret land on disk in the first place. The pattern: a small wrapper authenticates the developer to the vault using their SSO identity, fetches the secret at process-start, exports it to the child process's environment, and never persists it.
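The wrapper pattern described above can be sketched as follows. `fetch_secret` is a hypothetical stand-in for a real vault client authenticated with the developer's SSO session; everything else is the standard library. The secret exists only in the child process's environment and is never written to disk.

```python
import os
import subprocess
import sys

def fetch_secret(name):
    """Hypothetical placeholder for a vault lookup using the caller's identity."""
    return "demo-value-for-" + name

def run_with_secrets(command, secret_names):
    env = dict(os.environ)           # copy; the parent environment is untouched
    for name in secret_names:
        env[name] = fetch_secret(name)  # injected at process start only
    return subprocess.run(command, env=env, capture_output=True,
                          text=True, check=True)

if __name__ == "__main__":
    child = [sys.executable, "-c", "import os; print(os.environ['API_TOKEN'])"]
    result = run_with_secrets(child, ["API_TOKEN"])
    print(result.stdout.strip())
```

Several vault and secrets-manager CLIs ship a command with this shape; the point of the sketch is the flow, not a replacement for them.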

Terraform & OpenTofu - dynamic credentials and vault-sourced inputs

Two distinct concerns: (1) how Terraform itself authenticates to the cloud provider, and (2) how Terraform retrieves secret values it needs to pass to resources. Both have mature answers that avoid static secrets.

Authenticating Terraform to the cloud

Pulling secret values into Terraform configurations

DevOps tooling - Ansible, Pulumi, Helm, scripts

The decision tree

  1. Can the platform mint a short-lived credential via workload identity or OIDC? Use that. (Cloud SDK on EC2 / AKS / GKE, OIDC trust for CI/CD, TFC dynamic provider credentials.) No secret to manage is the goal state.
  2. If not, can the secret be retrieved on-demand from a vault using the caller's identity? Use that. (External Secrets Operator, Vault Agent, AWS/Azure/GCP SDK against their native managers.)
  3. If a static value must live somewhere, is it in a secrets manager with rotation, audit, and access control? Use that. Never an env var checked into config, never a plaintext file on disk, never in chat.
  4. Pre-commit, pre-push, and CI scanners are the safety net, not the strategy. TruffleHog, Gitleaks, GitHub push protection, GitLab Secret Detection - all should be enabled, but if they're catching anything regularly, the workflow upstream needs fixing.
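To show what the safety-net scanners in step 4 look for, here is a deliberately tiny sketch that flags strings shaped like AWS access key IDs (the `AKIA` prefix followed by 16 uppercase alphanumerics). Real scanners such as TruffleHog and Gitleaks use many more rules plus entropy checks and verification; this is only an illustration of the idea.

```python
import re

# Shape of a long-lived AWS access key ID: "AKIA" + 16 uppercase alphanumerics.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

def scan(text):
    """Return (line_number, redacted_match) for each suspected key ID."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in AWS_KEY_ID.finditer(line):
            # Redact so the scanner's own output does not leak the key.
            hits.append((lineno, match.group()[:8] + "..."))
    return hits

if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key ID used in AWS documentation.
    sample = "DEBUG=1\nAWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\n"
    print(scan(sample))
```

A hit in a pre-commit hook is cheap; the same hit in a public repository means the key must be treated as compromised and rotated.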

Common cloud privilege escalation paths

The defender's view of how attackers move from a low-privilege foothold to admin in cloud. The pattern names change by provider; the shape repeats.

AWS

GCP

Azure

Cross-cloud and federation

For attack-path tooling beyond manual review, BloodHound (Cloud), PMapper, Pacu, and the CIEM tools listed earlier all visualize these paths automatically. Run them in offense mode against your own environment quarterly.

MFA & phishing-resistant authentication

"MFA enabled" is no longer a meaningful security control by itself. The factor type matters more than the presence of a second factor.

The factor ladder

Conditional access

The factor is one input to the access decision; context is the other. Conditional access policies (Entra Conditional Access, Okta Adaptive MFA, Google BeyondCorp, Cloudflare Zero Trust) evaluate device posture, location, network, time, sign-in risk score, and recent behavior - and either allow, step-up (require an additional factor), or block. The shape of a mature policy: only managed compliant devices, only from expected geographies, step-up for admin scopes, block from anonymizing networks (Tor, known VPN exit nodes that match attacker telemetry).
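The allow / step-up / block outcome can be sketched as a small decision function. The signal names and the expected-country set are assumptions for the example; real conditional access engines evaluate far richer signals and risk scores.

```python
EXPECTED_COUNTRIES = {"US", "CA"}  # hypothetical allow-list of geographies

def decide(signal):
    """Return 'allow', 'step_up', or 'block' for one sign-in attempt."""
    if signal["anonymizing_network"]:
        return "block"
    if not signal["managed_device"]:
        return "block"
    if signal["country"] not in EXPECTED_COUNTRIES:
        return "block"
    if signal["admin_scope"] and not signal["phishing_resistant_mfa"]:
        return "step_up"  # require a stronger factor before admin access
    return "allow"

if __name__ == "__main__":
    attempt = {
        "anonymizing_network": False,
        "managed_device": True,
        "country": "US",
        "admin_scope": True,
        "phishing_resistant_mfa": False,
    }
    print(decide(attempt))
```

Ordering matters: block conditions are checked before step-up so that a strong factor can never override a bad device or network.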

Session binding

The 2026 attack pattern that bypasses MFA is session-cookie theft via infostealer (Lumma, RedLine, etc.). The user does FIDO2 successfully; the malware exfiltrates the resulting session cookie; the attacker reuses it from anywhere. Defenses: device-bound session credentials (DBSC), token-binding, continuous-access evaluation that revokes sessions on risk-score changes, and short session lifetimes for sensitive scopes. Expect session-theft to dominate the next round of cloud incidents until DBSC and equivalents are universally deployed.

Detection signals for identity abuse

The detections every cloud SOC should have running. None of these are exotic; the failure mode is rarely "we couldn't see it" - it's "we never wrote the rule." (See also: Cloud SOC.)

High-signal detections

Where the detections live

Feed CloudTrail (AWS), Activity Log + Sign-in Logs (Azure), and Cloud Audit Logs (GCP) plus the IdP audit log (Okta System Log, Entra Sign-in / Audit logs, Google Workspace Reports) into one of: a SIEM (Splunk, Elastic, Sumo Logic, Sentinel, Chronicle), a detection-engineering platform (Panther, Anvilogic, Hunters), or the cloud-native detection stack (GuardDuty + Detective, Defender for Cloud, Security Command Center + Chronicle). The detections above translate to all of them; the rule libraries (Sigma, Splunk SPL, KQL, YARA-L) are widely shared.
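As an illustration of how small such a rule can be, here is a sketch of the dormant-then-active pattern mentioned elsewhere on this page, written in Python rather than a SIEM query language. It assumes audit events have been normalized into hypothetical `principal` / `time` records.

```python
from datetime import datetime, timedelta

def dormant_then_active(events, dormancy=timedelta(days=90)):
    """Flag principals that act again after a long silence."""
    last_seen = {}
    alerts = []
    for event in sorted(events, key=lambda e: e["time"]):
        previous = last_seen.get(event["principal"])
        if previous is not None and event["time"] - previous >= dormancy:
            alerts.append((event["principal"], previous, event["time"]))
        last_seen[event["principal"]] = event["time"]
    return alerts

if __name__ == "__main__":
    # Hypothetical normalized events.
    log = [
        {"principal": "svc-backup", "time": datetime(2025, 6, 1)},
        {"principal": "svc-deploy", "time": datetime(2026, 1, 9)},
        {"principal": "svc-deploy", "time": datetime(2026, 1, 10)},
        {"principal": "svc-backup", "time": datetime(2026, 1, 10)},
    ]
    for alert in dormant_then_active(log):
        print(alert)
```

The same logic ports to any of the platforms above as a lookup against a "last seen" table keyed by principal.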

AWS, Azure, and GCP side-by-side

The native IAM constructs each cloud ships, reduced to a one-screen reference:

Capability | AWS | Azure | GCP
Workforce SSO | IAM Identity Center (federated to Okta / Entra / Google) | Entra ID (native), federate external IdPs | Cloud Identity / Workforce Identity Federation
Authorization model | Identity-based + resource-based policies, JSON | Azure RBAC + Azure Policy, role definitions in JSON | IAM bindings (role-based) + IAM Conditions
Workload identity (VMs / functions) | IAM roles + IMDSv2 | Managed Identities (system / user-assigned) | Service accounts attached to compute
Kubernetes workload identity | IRSA / EKS Pod Identity | Azure Workload Identity (AKS) | GKE Workload Identity
External workload federation | IAM OIDC identity providers + AssumeRoleWithWebIdentity | Entra federated credentials on app registrations | Workload Identity Federation pools & providers
JIT / privileged access | IAM Identity Center session limits + third-party (ConductorOne, Opal, Sym, Teleport) | Entra Privileged Identity Management (PIM) | Privileged Access Manager + conditional IAM
Permissions / unused-access analysis | IAM Access Analyzer (external + unused + policy gen) | Entra Permissions Management (formerly CloudKnox), Access Reviews | IAM Recommender, Policy Analyzer, Policy Troubleshooter
Org-level guardrails | Service Control Policies, Resource Control Policies, Declarative policies | Azure Policy + Initiatives at mgmt-group scope | Organization Policy constraints
Activity audit log | CloudTrail (org trail) | Activity Log + Entra Audit / Sign-in Logs | Cloud Audit Logs (Admin / Data Access)
Identity threat detection | GuardDuty (IAM & S3 protection), Detective | Entra ID Protection, Defender for Identity, Defender for Cloud | Security Command Center, Event Threat Detection, Chronicle
Phishing-resistant MFA | FIDO2 security keys + passkeys via IAM Identity Center / IdP | FIDO2 + passkeys + Windows Hello via Conditional Access | FIDO2 + Titan keys + passkeys via Google Workspace
Break-glass & root | Root user - store credentials sealed; CloudTrail-alert on use | Global Admin emergency accounts (2+), excluded from MFA Conditional Access carefully | Super Admin emergency accounts, gcloud-restricted access

The native tools are necessary but rarely sufficient at scale. Most orgs running multi-cloud reach for a CIEM (Wiz, Sonrai, Permiso, ConductorOne, Veza) to get a cross-cloud effective-permissions view; the native tools are best-in-class for their own cloud and weakest at "show me everything one principal can do across our entire estate."

Maturity stages

A useful staging model for a cloud IAM program:

Crawl - Federated & inventoried

One IdP for all humans, federated into every cloud. SCIM-driven joiner / mover / leaver. Long-lived access keys inventoried and aged out. MFA enforced (any factor) on all admins. CloudTrail / Activity Log / Cloud Audit Logs centralized. A first inventory of roles and service accounts exists.

Walk - Least privilege at rest

Permissions analyzers running monthly; unused permissions trimmed. CIEM or equivalent visualizes effective permissions across accounts. Workload identity adopted for most net-new workloads; access keys trending toward zero. Phishing-resistant MFA mandatory for admins. Basic identity-abuse detections firing into the SOC.

Run - JIT & engineered

Standing admin reduced to near-zero via JIT. SCPs / Organization Policies / Azure Policy guardrails block known-bad patterns at the API. OIDC trust for all CI/CD; no long-lived secrets in pipelines. Identity-threat detections (impossible travel, unused-then-used, escalation primitives) all running. Quarterly attack-path review with BloodHound / PMapper / CIEM.

Fly - Identity is a product

Identity team treats internal IAM as a paved-road service: self-service role requests, automated right-sizing, drift detection blocking PRs in IaC. Customer-facing identity (CIAM / B2B SSO) is differentiated as a competitive feature. Continuous-access evaluation, device-bound sessions, and just-enough-access reduce blast radius to single API operations.

The skip-stage cost is real here - JIT without federated SSO produces a Frankenstein of half-controls; CIEM without an IdP inventory produces noise without action. Sequence matters.

Common pitfalls

Further reading

Specifications & standards

Provider docs

Attack & offense research

Related CSOH pages

FAQ

Why is IAM the #1 root cause of cloud breaches?

Every cloud action is an authenticated API call, so identity is the perimeter. When an attacker compromises a credential - a leaked access key, a phished session cookie, a misconfigured OIDC trust - they inherit whatever permissions that identity holds, and most identities hold far more than they need. The Verizon DBIR, Mandiant M-Trends, and IBM Cost of a Data Breach reports all converge on the same finding: stolen or abused credentials are the top initial-access vector in cloud, ahead of vulnerability exploitation and phishing-of-endpoints. The fix is not a single product; it's a program - strong authentication for humans, short-lived tokens for workloads, least privilege enforced continuously, and detections that fire on identity anomalies.

What's the difference between RBAC, ABAC, and ReBAC?

RBAC (role-based) grants permissions to named roles, and assigns subjects to roles - simple, auditable, but combinatorially expensive at scale. ABAC (attribute-based) grants permissions through policies that evaluate attributes of the subject, resource, action, and context (e.g., "engineers can read S3 buckets tagged team=their-team in us-east-1 during business hours") - flexible, scales without role explosion, but harder to reason about. ReBAC (relationship-based, popularized by Google Zanzibar / OpenFGA / SpiceDB) models access as graph relationships ("user is editor of document, document is in folder, folder is owned by team") - fits SaaS sharing models and hierarchical resources. Most cloud platforms are RBAC at the surface (roles, policies) with ABAC primitives (tag / condition keys / IAM conditions) layered in. Use RBAC for cloud-control-plane access; ABAC for scale-out tenant data-plane decisions; ReBAC inside applications that model sharing.

How do I get rid of long-lived cloud access keys?

For humans: federate through your IdP (Okta, Entra ID, Google Workspace) into IAM Identity Center / Entra-based federation / Workforce Identity Federation. Humans get short-lived credentials via SSO; static keys go away. For workloads in cloud: use the native workload-identity primitives - IAM Roles for EC2 / Lambda, IRSA / EKS Pod Identity for Kubernetes on AWS, Managed Identities / Azure Workload Identity for Azure, GCE / GKE Workload Identity for GCP. For workloads outside cloud (CI/CD, on-prem agents, third-party SaaS): use OIDC trust - GitHub Actions, GitLab, CircleCI all publish OIDC tokens that AWS / Azure / GCP can trust without any pre-shared secret. Rotate the remaining long-lived keys aggressively, alert on creation, and track them to zero. The metric to chase is "count of access keys older than N days" trending toward zero.

Is MFA enough for cloud admin accounts?

MFA is necessary but not sufficient, and the type of MFA matters more than the presence of it. SMS-based codes are phishable and SIM-swappable. TOTP (authenticator apps) is phishable through real-time relay attacks (Evilginx-class kits). The bar for cloud admins in 2026 is phishing-resistant authentication - FIDO2 security keys, platform passkeys, or device-bound credentials via WebAuthn. Pair that with conditional access (only from managed devices, only from expected geographies, step-up auth for sensitive actions) and session-binding (tokens tied to the device that minted them). Treat "MFA enabled" as a compliance floor; treat "phishing-resistant for everyone with prod cloud access" as the actual security target.

What is iam:PassRole and why does it matter?

iam:PassRole is the AWS permission that lets a principal hand a role to a service to assume on its behalf. If you can pass a powerful role to a service you control (Lambda, EC2, SageMaker, Glue, CloudFormation, dozens more), you can effectively assume that role's permissions through the service. It's the single most-abused privilege escalation primitive in AWS - the equivalent of "I can't read this S3 bucket, but I can launch an EC2 instance with a role that can." The defense: scope iam:PassRole tightly using the iam:PassedToService condition, restrict which roles can be passed by ARN, and audit it with IAM Access Analyzer's unused access and external access analyzers. Azure's analog is the ability to assign a managed identity with elevated rights; GCP's is the iam.serviceAccounts.actAs permission. The pattern repeats across all three clouds; the names differ.
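
As an illustration of that defense, the sketch below shows a scoped policy statement (one role ARN, one receiving service) and a small Python heuristic that flags statements granting iam:PassRole too broadly. The account ID, role name, and function names are hypothetical, and the check is deliberately simple: it does not handle NotAction, partial wildcards such as iam:Pass*, or Deny statements, so treat it as a starting point rather than a policy analyzer.

```python
# Scoped: only one named role may be passed, and only to Lambda.
# The account ID and role name are placeholders.
SCOPED_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PassOnlyTheAppRoleToLambda",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::111122223333:role/app-lambda-exec",
            "Condition": {
                "StringEquals": {"iam:PassedToService": "lambda.amazonaws.com"}
            },
        }
    ],
}

# Risky: any role can be passed to any service.
BROAD_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": ["iam:PassRole"], "Resource": "*"}],
}


def _as_list(value):
    # Policy fields may be a single value or a list of values.
    return value if isinstance(value, list) else [value]


def broad_passrole_statements(policy):
    """Return Allow statements that grant iam:PassRole on a wildcard
    resource or without an iam:PassedToService condition."""
    findings = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in _as_list(stmt.get("Action", []))]
        if not any(a in ("iam:passrole", "iam:*", "*") for a in actions):
            continue
        resources = _as_list(stmt.get("Resource", []))
        wildcard_resource = any(r == "*" or r.endswith(":role/*") for r in resources)
        has_service_condition = any(
            key.lower() == "iam:passedtoservice"
            for operator_block in stmt.get("Condition", {}).values()
            for key in operator_block
        )
        if wildcard_resource or not has_service_condition:
            findings.append(stmt)
    return findings


print(len(broad_passrole_statements(SCOPED_POLICY)))
print(len(broad_passrole_statements(BROAD_POLICY)))
```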

What does just-in-time (JIT) access look like in cloud?

JIT access means standing permissions are minimal, and elevated permissions are requested per-need with an expiration. The mechanics vary by cloud and platform: AWS via IAM Identity Center permission sets + approval workflows or via tools like ConductorOne / Tenable Cloud Security / Sym; Azure via Entra PIM (Privileged Identity Management) with eligibility, activation, MFA-on-activation, and time bounds; GCP via temporary IAM grants with conditions or Privileged Access Manager. The audit trail captures who requested, who approved, why, and when access expired. The operational benefit is that the breach blast-radius from a compromised user is small most of the time - they only have elevated privileges during the window they actively requested. The maturity step beyond JIT is just-enough-access: even during the window, only the specific actions needed, not a generic admin role.
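
Stripped of any vendor tooling, the core of JIT is a grant record with an approver, a reason, and an expiry. The Python sketch below models that under assumptions: the Grant and approve names, the one-hour default window, and the no-self-approval rule are illustrative choices, not how IAM Identity Center, Entra PIM, or Privileged Access Manager implement it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    # The audit trail fields: who requested, who approved, why, and for how long.
    requester: str
    approver: str
    role: str
    reason: str
    granted_at: datetime
    ttl: timedelta

    @property
    def expires_at(self) -> datetime:
        return self.granted_at + self.ttl

    def is_active(self, at: datetime) -> bool:
        # Elevated access exists only inside the approved window.
        return self.granted_at <= at < self.expires_at


def approve(requester, approver, role, reason, now, ttl=timedelta(hours=1)):
    """Create a time-bound grant, enforcing two assumed workflow rules."""
    if requester == approver:
        raise ValueError("self-approval is not allowed")
    if not reason.strip():
        raise ValueError("a justification is required")
    return Grant(requester, approver, role, reason, now, ttl)


t0 = datetime(2026, 1, 15, 9, 0, tzinfo=timezone.utc)
grant = approve("alice", "bob", "prod-db-admin", "incident follow-up", now=t0)
print(grant.is_active(t0 + timedelta(minutes=30)))
print(grant.is_active(t0 + timedelta(hours=2)))
```

Just-enough-access would narrow the role field further, for example to a role that allows only the specific actions named in the request.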

What are non-human identities (NHIs) and why are they a cloud security problem?

Non-human identities (NHIs) are every identity that isn't a person - service accounts, API keys, OAuth tokens, secrets, certificates, workload roles, third-party SaaS integration credentials, bots, and now AI agents. In a typical 2026 cloud estate, NHIs outnumber humans 45-100 to 1, almost none are tied to a clear owner, most have no defined lifecycle, and a meaningful percentage are over-permissioned. They are the dominant blast-radius surface and have driven a string of recent breaches - Cloudflare via leaked Okta service tokens, Microsoft via a legacy test OAuth app abused by Midnight Blizzard, Internet Archive via an exposed GitLab token, Dropbox Sign via a compromised service account, and the long tail of GitHub leaked AWS access keys. The 2026 program covers four things: (1) inventory every NHI across cloud, SaaS, and code, (2) assign each one an owner and a purpose, (3) put every secret in a vault with rotation and short-lived alternatives where possible (OIDC trust, workload identity), (4) detect anomalous NHI behavior - new consents, unusual call patterns, dormant-then-active. The category has a maturing vendor space: Astrix, Entro, Oasis Security, Aembit, Token Security, Clutch, Britive, Natoma, Andromeda Security.
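
Steps (2) and (4) of that program lend themselves to a simple triage pass over the inventory. The Python sketch below is illustrative only: the NHI record shape, the finding labels, and the 90-day dormancy and 1-day recency thresholds are assumptions, and a real inventory would be built from cloud, SaaS, and code-scanning sources rather than typed in by hand.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class NHI:
    name: str
    owner: Optional[str]
    purpose: Optional[str]
    previous_use: Optional[datetime]  # most recent use before last_use
    last_use: Optional[datetime]


def triage(nhi, now, dormant_after=timedelta(days=90), recent=timedelta(days=1)):
    """Return a list of finding labels for one non-human identity."""
    findings = []
    if not nhi.owner:
        findings.append("no-owner")
    if not nhi.purpose:
        findings.append("no-purpose")
    if nhi.last_use and nhi.previous_use:
        # Dormant-then-active: a long gap between uses, followed by recent use.
        long_gap = nhi.last_use - nhi.previous_use >= dormant_after
        used_recently = now - nhi.last_use <= recent
        if long_gap and used_recently:
            findings.append("dormant-then-active")
    return findings


now = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
inventory = [
    NHI("ci-deploy-role", "platform-team", "deploys from CI",
        now - timedelta(hours=5), now - timedelta(hours=1)),
    NHI("legacy-test-oauth-app", None, None,
        now - timedelta(days=400), now - timedelta(hours=2)),
]
for nhi in inventory:
    print(nhi.name, triage(nhi, now))
```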

How do I detect identity abuse in cloud?

The high-signal detections to build first: (1) console login or AssumeRole from a new ASN / country / device for a privileged identity; (2) GetCallerIdentity called by a long-lived access key from an unusual source - the "who am I, where are my permissions" reconnaissance step is reliable post-compromise behavior; (3) impossible travel between sessions; (4) MFA-disable, password-reset, or new-access-key events on privileged accounts; (5) AssumeRole chain length anomalies (one role assumed via another via another); (6) new OAuth app consent grants in your IdP; (7) iam:CreateAccessKey / iam:CreateLoginProfile / iam:AttachUserPolicy / sts:GetFederationToken from non-admin paths, plus role-passing calls from the same paths (iam:PassRole is a permission check rather than a logged API call, so watch the service calls that carry a role, such as lambda:CreateFunction or ec2:RunInstances); (8) service-account-key creation in GCP, federated-credential addition in Azure, IAM-user creation in AWS - all close to zero in a mature org, all high-value to alert on. Feed CloudTrail / Activity Log / Cloud Audit Logs into a SIEM or detection-engineering platform (Panther, Anvilogic, SnapAttack, native Sentinel) and tune from there.
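
Detection (2) is small enough to sketch in Python against a single CloudTrail record. The field names (eventName, userIdentity.accessKeyId, sourceIPAddress) follow the CloudTrail record format, and long-lived AWS access key IDs begin with AKIA while temporary ones begin with ASIA. The expected network range and the function name are hypothetical, and "unusual source" is reduced here to "outside a known egress range", which a real rule would replace with a learned baseline.

```python
import ipaddress

# Hypothetical egress range for known automation; replace with your own.
EXPECTED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_suspicious_whoami(event, expected_networks=EXPECTED_NETWORKS):
    """Flag GetCallerIdentity made with a long-lived key from an unexpected IP."""
    if event.get("eventName") != "GetCallerIdentity":
        return False
    key_id = event.get("userIdentity", {}).get("accessKeyId", "")
    if not key_id.startswith("AKIA"):
        # Temporary credentials (ASIA...) are out of scope for this rule.
        return False
    try:
        source = ipaddress.ip_address(event.get("sourceIPAddress", ""))
    except ValueError:
        # sourceIPAddress can be a service DNS name rather than an IP.
        return False
    return not any(source in network for network in expected_networks)


event = {
    "eventSource": "sts.amazonaws.com",
    "eventName": "GetCallerIdentity",
    "sourceIPAddress": "198.51.100.7",
    "userIdentity": {"type": "IAMUser", "accessKeyId": "AKIAEXAMPLE0001"},
}
print(is_suspicious_whoami(event))
```

In a SIEM the same logic would usually be a query or rule rather than a Python function; the point is that the rule needs only three fields, which keeps it cheap to run and easy to tune.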

Where next