The 30-second version: Threat modeling is the practice of reasoning out loud, in writing, about what can go wrong with a system before an attacker does it for you. It is not a checklist, not a tool, and not a compliance artifact - though all three exist around it. The classic framing is Adam Shostack's four questions: What are we building? What can go wrong? What are we going to do about it? Did we do a good job?
The practitioner's stack: STRIDE as the day-to-day categorization, PASTA for high-stakes risk-quantified modeling, LINDDUN for privacy, attack trees for goal-oriented analysis, and MITRE ATT&CK Cloud as the threat library. Cloud changes the model in six ways: identities everywhere, the control plane is the attack surface, shared responsibility shrinks scope, multi-tenancy is an assumption, default stances differ per cloud, and workloads change faster than design-time models.
On this page
- What threat modeling is (and isn't)
- Shostack's four questions
- STRIDE
- STRIDE-per-element vs per-interaction
- PASTA
- LINDDUN (privacy)
- Attack trees
- Kill chains & ATT&CK
- OWASP Threat Dragon & Microsoft TMT
- Commercial platforms
- Where it fits in the SDLC
- The agile threat model
- What's different for cloud
- Worked example: 3-tier AWS app
- Worked example: LLM RAG app
- Worked example: multi-account landing zone
- Threat libraries vs custom
- Risk scoring threats
- Threat modeling as code
- AI in threat modeling
- Common pitfalls
- Maturity stages
- Further reading
- FAQ
What threat modeling is (and isn't)
Threat modeling is a structured way to reason about what can go wrong with a system. It produces a written artifact - a list of threats, ranked, with mitigations and owners - that future versions of you, your team, and the auditor can read. It is most valuable when applied during design, when the cost of changing an architectural decision is still close to zero.
It is not a checklist. Checklists answer "did you do X?" Threat modeling asks "given what we are building, what is the set of Xs we should worry about?" - which is a different and earlier question. A checklist that came out of a threat model is a useful artifact; a checklist that replaces the modeling is dangerous, because it bakes in someone else's threat assumptions.
It is not a tool. Threat Dragon, Microsoft TMT, IriusRisk, and a dozen others can support the work, but the work itself is human reasoning over a system diagram. A team without tooling that runs a 30-minute STRIDE session every two weeks will out-deliver a team with a $50,000 platform that never books the meeting.
It is not a compliance artifact, though SOC 2, ISO 27001, PCI DSS, FedRAMP, and most modern frameworks now reference threat modeling explicitly. Doing it only for the auditor produces compliance theater - a 200-page document everyone signs once, nobody updates, and which fails its first contact with a real attacker. Do it because it makes your system safer; the audit evidence follows.
It is not a one-time event. Systems evolve; trust boundaries shift; new third-party integrations introduce new edges. A model from two years ago describes a system that no longer exists. The discipline is in re-running it at every meaningful design change.
Shostack's four questions
Adam Shostack - author of Threat Modeling: Designing for Security, the field's defining textbook - reduced the practice to four questions. Almost every methodology since maps to one of them.
1. What are we building?
Draw the system. A data-flow diagram with components, data stores, external entities, and the edges (interactions) between them. Mark the trust boundaries - the dashed lines where data crosses from one zone of trust to another. The diagram is the foundation; vague diagrams produce vague models.
2. What can go wrong?
Apply a categorization - STRIDE, PASTA, LINDDUN, attack trees, or a custom library - to enumerate threats. Walk the diagram element-by-element or interaction-by-interaction. Don't filter for plausibility yet; the goal is breadth, then ranking.
3. What are we going to do about it?
For each threat, decide: mitigate (add a control), transfer (insurance, contract, third-party), accept (risk decision with an owner), or avoid (don't build the thing). The output is mitigations with owners and target dates.
4. Did we do a good job?
Validation. Did the mitigations ship? Did peer reviewers find threats you missed? Did pentests or red-team exercises confirm the threats you ranked were the right ones? The fourth question is the one most teams skip; it's also the one that turns threat modeling into an evolving discipline rather than a one-shot ritual.
Skip any one of the four and the work degrades. Skip question one and you'll find threats in the imaginary architecture, not the real one. Skip question three and you've cataloged risks rather than reduced them. Skip question four and you'll never know whether you got better.
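Question one is easier to keep honest when the diagram also exists as data. The following is a minimal sketch, not a prescribed format: the Element and Flow classes and the trust_zone labels are hypothetical names chosen for illustration. It records a small data-flow diagram and picks out the flows that cross a trust boundary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    kind: str        # "external_entity", "process", or "data_store"
    trust_zone: str  # hypothetical zone label, e.g. "internet" or "vpc"


@dataclass(frozen=True)
class Flow:
    source: Element
    target: Element
    label: str

    def crosses_trust_boundary(self) -> bool:
        # A flow crosses a boundary when its two ends sit in different zones.
        return self.source.trust_zone != self.target.trust_zone


user = Element("User", "external_entity", "internet")
app = Element("App tier", "process", "vpc")
db = Element("Database", "data_store", "vpc")

flows = [
    Flow(user, app, "HTTPS request"),
    Flow(app, db, "SQL query"),
]

# The boundary-crossing flows are where question two deserves the most time.
boundary_crossings = [f.label for f in flows if f.crosses_trust_boundary()]
print(boundary_crossings)
```

A diagram kept this way can be diffed in code review, which makes a shifted trust boundary visible as a changed line rather than a redrawn picture.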
STRIDE
STRIDE is Microsoft's six-category mnemonic for the kinds of threats that systems face. Each letter maps to a security property that the threat violates. It is the most-taught, most-used, and most-tractable categorization scheme - a team can produce a useful threat list in 30 minutes with nothing more than STRIDE and a diagram.
| Letter | Threat category | Violates | Cloud-specific example |
|---|---|---|---|
| S | Spoofing | Authentication | Forged JWT, stolen IAM role credentials, IMDSv1 SSRF impersonating the workload identity, OIDC trust-policy misconfiguration trusting any GitHub repo. |
| T | Tampering | Integrity | S3 object overwrite without versioning, container image swap before pull, IaC plan tampering between plan and apply, RDS audit log mutation. |
| R | Repudiation | Non-repudiation | CloudTrail disabled in one region, KMS key used without logging, shared IAM user accessed by multiple humans, log-bucket retention disabled by the attacker. |
| I | Information disclosure | Confidentiality | Public S3 bucket, error message returning stack trace + secrets, secrets in env vars exposed via SSRF, blob storage SAS URL leaking customer data. |
| D | Denial of service | Availability | Cloud-bill DoS via misconfigured auto-scaling, billing-account suspension after fraud signals, exhaustion of API quotas, KMS key rotation breaking dependent services. |
| E | Elevation of privilege | Authorization | iam:PassRole abuse, AssumeRoleWithWebIdentity to a more-privileged role, lateral movement via shared KMS keys, GuardDuty bypass via cross-account trust. |
The mnemonic is a generative tool, not a comprehensive ontology. STRIDE doesn't explicitly cover privacy (that's LINDDUN), supply chain (that's a category in its own right), or business-logic abuse (which often lives at the application layer below the threat-model abstraction). Use STRIDE as the starting categorization and add domain-specific lenses where the system warrants it.
STRIDE-per-element vs STRIDE-per-interaction
There are two ways to apply STRIDE against a data-flow diagram. They produce different threat lists and different costs.
STRIDE-per-element
For each element on the diagram - external entity, process, data store, data flow - apply the STRIDE categories that can affect that element type. The Microsoft Threat Modeling Tool encodes which categories apply to which element types out of the box: external entities can be Spoofed and can Repudiate; processes can suffer every STRIDE class; data stores can be Tampered with, leak Information, and suffer DoS, but are not generally Spoofed; data flows can be Tampered with, leak Information, and suffer DoS. The mechanical traversal makes per-element fast and predictable; it tends to under-find threats at trust boundaries.
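The element-type mapping described above is small enough to write down directly. This sketch assumes a diagram expressed as (name, kind) pairs, which is an illustrative format rather than any tool's file schema, and generates the prompts a per-element session would walk.

```python
# Which STRIDE categories apply to which element type, as described above.
APPLICABLE = {
    "external_entity": ["Spoofing", "Repudiation"],
    "process": [
        "Spoofing", "Tampering", "Repudiation",
        "Information disclosure", "Denial of service", "Elevation of privilege",
    ],
    "data_store": ["Tampering", "Information disclosure", "Denial of service"],
    "data_flow": ["Tampering", "Information disclosure", "Denial of service"],
}


def stride_per_element(elements):
    """Return one (element name, category) prompt per applicable category."""
    prompts = []
    for name, kind in elements:
        for category in APPLICABLE[kind]:
            prompts.append((name, category))
    return prompts


# Hypothetical diagram: names are examples, kinds must match APPLICABLE keys.
diagram = [
    ("User", "external_entity"),
    ("App tier", "process"),
    ("RDS", "data_store"),
    ("App tier -> RDS", "data_flow"),
]

prompts = stride_per_element(diagram)
for name, category in prompts:
    print(f"{name}: how could {category.lower()} happen here?")
```

Each prompt is a question for the room, not a finding; the session decides which ones turn into threats worth recording.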
STRIDE-per-interaction
For each edge (interaction) on the diagram - every arrow that crosses a trust boundary - apply the full STRIDE set. Per-interaction produces more threats and surfaces the cross-boundary problems STRIDE was originally designed to catch (the trust-boundary crossing is where most cloud threats actually live). It costs more time and demands more practitioner skill to keep the list useful.
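A per-interaction pass can be sketched the same way. The example below assumes each flow is a tuple of (source, source zone, target, target zone), where the zone labels are hypothetical, and applies the full STRIDE set only to flows whose ends sit in different zones.

```python
STRIDE = [
    "Spoofing", "Tampering", "Repudiation",
    "Information disclosure", "Denial of service", "Elevation of privilege",
]


def stride_per_interaction(flows):
    """Return (interaction, category) prompts for boundary-crossing flows."""
    prompts = []
    for source, source_zone, target, target_zone in flows:
        if source_zone == target_zone:
            continue  # same zone: no trust boundary crossed
        for category in STRIDE:
            prompts.append((f"{source} -> {target}", category))
    return prompts


# Hypothetical flows; zone names are illustrative.
flows = [
    ("User", "internet", "ALB", "public subnet"),
    ("ALB", "public subnet", "App tier", "private subnet"),
    ("App tier", "private subnet", "Worker", "private subnet"),
]

interaction_prompts = stride_per_interaction(flows)
print(len(interaction_prompts))
```

Two crossing flows already yield twelve prompts, which is the cost trade-off in miniature: more coverage at the boundaries, and more work to keep the list useful.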
The pragmatic default: STRIDE-per-element for fast, repeatable team sessions; STRIDE-per-interaction for new architectures, regulated workloads, or anything where the cost of a missed threat is high. Many mature teams run per-element on a quarterly cadence and per-interaction once at design and after any major architectural change.
PASTA - Process for Attack Simulation and Threat Analysis
PASTA is a seven-stage risk-centric methodology developed by Tony UcedaVélez and Marco Morana. Where STRIDE answers "what kinds of threats apply here," PASTA answers "given the business objectives, the actual technology stack, and credible attacker capabilities, what is the prioritized risk picture, and how should we treat it?" It produces a heavier artifact for higher-stakes systems.
| Stage | Activity | Output |
|---|---|---|
| 1 | Define business objectives | What the system exists to do; regulatory and contractual constraints; risk tolerance. |
| 2 | Define technical scope | Application boundaries, dependencies, third-party integrations, cloud services in scope. |
| 3 | Decompose the application | Data-flow diagrams, trust boundaries, identity model, data classification per element. |
| 4 | Threat analysis | Enumeration of credible threats - often using STRIDE, CAPEC, or ATT&CK as input libraries. |
| 5 | Vulnerability analysis | Map threats to existing weaknesses - known CVEs, misconfigurations, design flaws, control gaps. |
| 6 | Attack modeling | Attack trees, attack libraries, simulated kill chains. The "could an attacker actually do this?" stage. |
| 7 | Risk and impact analysis | Scored risks (likelihood × impact, or quantified via FAIR), treatment plan, residual risk register. |
PASTA's reputation as "heavyweight" is partly fair. The seven stages are days of work, not minutes. The payoff is a defensible risk artifact - one that a CISO can take to a board, a regulator can read, and an internal audit team can re-perform against. For most workloads, STRIDE is enough; PASTA earns its keep on regulated, mission-critical, or revenue-bearing systems where the stakes justify the cost.
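Stage 7's simplest scoring option, likelihood × impact, fits in a few lines. This is a sketch under an assumed 1-5 ordinal scale for both inputs; the scale, the example threats, and their ratings are illustrative and not part of PASTA itself.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    threat: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (expected)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def rank(risks):
    """Highest score first, so treatment planning starts at the top."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Illustrative ratings only; real values come out of stages 1 to 6.
register = [
    Risk("L7 DDoS exhausting backend", likelihood=3, impact=3),
    Risk("SSRF retrieves IMDS credentials", likelihood=4, impact=5),
    Risk("Public bucket access", likelihood=3, impact=5),
]

ranked = rank(register)
for risk in ranked:
    print(risk.score, risk.threat)
```

Ordinal scores like these rank threats against each other but say nothing about expected loss; that is the gap a quantified approach such as FAIR is meant to fill.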
LINDDUN - privacy threat modeling
LINDDUN is the privacy counterpart to STRIDE. Developed at KU Leuven, it categorizes privacy threats - the harms that arise from how a system handles personal data - that security-focused methodologies tend to miss. With GDPR (EU), CCPA / CPRA (California), LGPD (Brazil), PIPL (China), and a growing list of jurisdictional regimes, LINDDUN's relevance has expanded sharply.
| Letter | Threat category | Cloud example |
|---|---|---|
| L | Linkability | Two anonymized analytics events that, together, identify a user (cookie + IP + UA). |
| I | Identifiability | A "pseudonymous" identifier (hashed email) that's reversible via rainbow tables or a join. |
| N | Non-repudiation | Whistleblower or activist data linked to a real identity by audit log retention that exceeds the privacy promise. |
| D | Detectability | The presence of a user record being inferable even without reading it (e.g., conditional access patterns). |
| D | Disclosure of information | Confidential personal data leaking to unauthorized parties (overlaps STRIDE's I). |
| U | Unawareness | Users not informed how their data is processed; missing consent records or transparent notices. |
| N | Non-compliance | Processing that violates a specific regulatory obligation - retention beyond stated limits, cross-border transfer without lawful basis. |
Run LINDDUN alongside STRIDE on any system that processes personal data at scale. The categorizations partially overlap (Disclosure of information appears in both) but address different audiences - STRIDE for the security team, LINDDUN for the privacy office and the data-protection officer. A combined STRIDE + LINDDUN session is a strong default for any consumer-facing or healthcare workload.
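The Identifiability row is easy to demonstrate. The sketch below assumes a system that "pseudonymizes" email addresses with an unsalted SHA-256 hash; anyone holding a list of candidate addresses can reverse the token by hashing each candidate. The function names are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(email: str) -> str:
    # Unsalted hash: the same email always yields the same token.
    return hashlib.sha256(email.lower().encode()).hexdigest()


def reidentify(token: str, candidates):
    """Dictionary attack: hash every candidate and compare to the token."""
    for email in candidates:
        if pseudonymize(email) == token:
            return email
    return None


def keyed_pseudonymize(email: str, key: bytes) -> str:
    # Keyed variant: without the key, candidates cannot be hashed for comparison.
    return hmac.new(key, email.lower().encode(), hashlib.sha256).hexdigest()


token = pseudonymize("alice@example.com")
recovered = reidentify(token, ["bob@example.com", "alice@example.com"])
print(recovered)
```

The keyed variant raises the bar but only moves the problem: the identifier is still linkable across datasets that share the key, and it becomes reversible again if the key leaks.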
Attack trees
Attack trees are Bruce Schneier's contribution - a goal-oriented enumeration of how an attacker could achieve an objective. The root of the tree is the attacker's goal; the branches are sub-goals; the leaves are concrete actions an attacker would take. AND-nodes require all children; OR-nodes require any one.
Where STRIDE walks the defender's view of the system asking "what can go wrong at each element?", attack trees walk the attacker's view asking "given a goal, what paths lead to it?" The two are complementary; many practitioners use STRIDE to generate threat candidates and then build attack trees for the highest-priority ones to enumerate the actual paths.
Worked sketch - goal "exfiltrate customer PII from the database":
- OR Obtain valid database credentials
  - Steal application-tier IAM role credentials (SSRF, vulnerable workload)
  - Leaked secret in source repo
  - Phish a database administrator
- OR Abuse the cloud API to read the database directly
  - Compromise a privileged IAM principal with rds:* permissions
  - Confused-deputy via a backup-restore role that targets attacker-controlled storage
  - Cross-account assume-role chain ending at a snapshot-export role
- OR Compromise the application tier and exfiltrate via legitimate data flows
  - Vuln in an unpatched dependency
  - Container escape from a co-tenant workload
  - Supply-chain compromise of an application dependency
The value of writing this down: each leaf maps to a specific mitigation (or absence thereof). In an all-OR tree like this one, an attacker needs only one leaf to succeed, while the defender has to cut every leaf under every OR-branch. Attack trees make that asymmetry concrete in a way STRIDE alone doesn't.
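The AND/OR semantics can be checked mechanically. This sketch uses a hypothetical Node class and a trimmed version of the tree above; it asks whether the attacker's goal is still reachable given a set of leaves the defender has mitigated.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    gate: str = "LEAF"  # "AND", "OR", or "LEAF"
    children: list = field(default_factory=list)


def achievable(node, mitigated):
    """True if the attacker can still reach this node's goal."""
    if node.gate == "LEAF":
        return node.label not in mitigated
    results = [achievable(child, mitigated) for child in node.children]
    return all(results) if node.gate == "AND" else any(results)


def leaves(node):
    """Collect the labels of every leaf under a node."""
    if node.gate == "LEAF":
        return {node.label}
    return set().union(*(leaves(child) for child in node.children))


root = Node("Exfiltrate customer PII", "OR", [
    Node("Obtain valid database credentials", "OR", [
        Node("Steal IAM role credentials via SSRF"),
        Node("Leaked secret in source repo"),
        Node("Phish a database administrator"),
    ]),
    Node("Abuse the cloud API", "OR", [
        Node("Compromise principal with rds:* permissions"),
    ]),
])

all_leaves = leaves(root)
# Mitigating every leaf but one still leaves the goal reachable.
almost_all = all_leaves - {"Leaked secret in source repo"}
print(achievable(root, almost_all), achievable(root, all_leaves))
```

Treating a leaf as fully mitigated or not is a simplification; real mitigations reduce likelihood rather than remove a path, so the output is a prompt for discussion, not a verdict.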
Kill chains, ATT&CK, and threat libraries
Categorization frameworks (STRIDE, LINDDUN) answer "what kinds of bad things can happen?" Kill-chain models answer "what is the typical sequence an attacker follows to do them?" - useful for connecting design threats to detection and response controls (see Cloud SOC and Detection Engineering).
Lockheed Martin Cyber Kill Chain
The original seven-stage model: Reconnaissance → Weaponization → Delivery → Exploitation → Installation → Command & Control → Actions on Objectives. Useful as a teaching scaffold; criticized in cloud contexts because cloud breaches frequently skip stages (a leaked access key collapses several stages into one API call).
MITRE ATT&CK and ATT&CK Cloud Matrix
MITRE ATT&CK is a comprehensive threat library - tactics (the why), techniques (the how), and sub-techniques (the specifics) - built from real-world adversary behavior. The Cloud Matrix is the cloud-specific lens, with sub-matrices for AWS, Azure, GCP, Microsoft 365, Google Workspace, and SaaS. Most modern threat modeling references ATT&CK techniques by ID (T1078, T1098, T1136, etc.) so that designers, detection engineers, and red teams share vocabulary.
Other libraries
- CAPEC - MITRE's catalog of attack patterns. More design-oriented than ATT&CK (which is operations-oriented).
- MITRE D3FEND - the defender-side counterpart, mapping countermeasures to ATT&CK techniques. Use it to validate that you have at least one detective or preventive control per high-priority technique.
- ATT&CK for Containers, ATT&CK for ICS, MITRE ATLAS (ML-specific) - domain-specific extensions.
- OWASP Top 10 (web), OWASP API Top 10, OWASP LLM Top 10 - focused, well-known starting points for app, API, and AI workloads.
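Shared technique IDs make a simple coverage check possible: for each high-priority technique, is at least one control recorded? The sketch below uses the three ATT&CK IDs mentioned above; the control descriptions and the mapping itself are hypothetical and would come from your own model.

```python
# ATT&CK technique IDs and names referenced by the model.
TECHNIQUES = {
    "T1078": "Valid Accounts",
    "T1098": "Account Manipulation",
    "T1136": "Create Account",
}

# Hypothetical control mapping maintained by the team.
controls = {
    "T1078": ["MFA on all human identities", "alert on console login anomalies"],
    "T1098": ["alert on access-key creation for another user"],
    "T1136": [],
}


def uncovered(priority_techniques, control_map):
    """Return technique IDs with no preventive or detective control recorded."""
    return sorted(t for t in priority_techniques if not control_map.get(t))


gaps = uncovered(TECHNIQUES, controls)
for technique_id in gaps:
    print(f"{technique_id} ({TECHNIQUES[technique_id]}): no control recorded")
```

A technique with a control listed is not necessarily covered well; the check only catches the case where nothing is written down at all.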
OWASP Threat Dragon & Microsoft Threat Modeling Tool
OWASP Threat Dragon
OWASP Threat Dragon is the open-source, cross-platform threat modeling tool. Web or desktop; diagrams as JSON; STRIDE-per-element built in; git-friendly storage so threat models can live next to the code. Strengths: free, vendor-neutral, scriptable, good enough for the 80% case. Limitations for cloud: it's element-and-flow generic - there's no cloud-service-aware threat library, so you bring the cloud knowledge yourself.
Microsoft Threat Modeling Tool (TMT)
Microsoft TMT is the canonical STRIDE tool, free and Windows-only. Strengths: an excellent stencil library, automatic threat generation per STRIDE-per-element rules, and good reporting. Limitations: Windows-only, Azure-flavored stencils (you can use it for AWS or GCP, but you'll fight it), and the threat library is generic.
Both tools share a common gap for cloud: the generated threat lists are categorically correct (STRIDE) but lack the cloud-specific specificity that makes a threat model actionable. A cloud-aware practitioner has to enrich the output with the IMDS, IAM, OIDC, KMS, and shared-responsibility-edge threats the generic library doesn't surface.
Commercial threat modeling platforms
The commercial layer adds cloud-aware threat libraries, integration with CI/CD and ticketing, and team-collaboration features that the open-source tools lack.
- IriusRisk - questionnaire-driven model building, large built-in threat library mapped to CAPEC and ATT&CK, integrations with Jira, GitHub, GitLab. Strong for organizations that want repeatable models across many teams without each team needing deep threat-library knowledge.
- ThreatModeler - visual modeling with cloud-aware stencils (AWS, Azure, GCP) and threat generation tied to the components used. Good for teams already comfortable with diagram-first workflows.
- SecuriCAD (foreseeti, now part of Google) - attack-graph-based modeling that simulates probabilistic attacker paths against a model. Different paradigm: closer to attack-graph analysis than STRIDE enumeration.
- Cyversity, Devici, Tutamantic - newer entrants with AI-assisted threat generation, diagram-from-IaC import, and tighter dev-loop integration.
- Cloud-provider threat-modeling helpers - AWS published Threat Composer (open-source); the AWS Security Workshops include threat-modeling tracks. Azure offers built-in TMT templates; GCP has guidance but no first-party tool.
The selection criterion that matters: which tool produces threat lists your team will actually act on. A platform with a deep threat library nobody trusts produces dust-collecting models. A simpler tool that's been adopted by the team produces shipped mitigations.
Where threat modeling fits in the SDLC
The most expensive moment to find an architectural threat is after the architecture is implemented. The cheapest is at the whiteboard. The SDLC placements that earn their cost:
- Design review. The original target. Threat model before the architecture decision record (ADR) is approved. Most threats found here cost minutes to fix - change a trust boundary, choose a different service, scope a role differently.
- Pre-implementation security review. When the team has detailed design but no code yet. Re-confirm the model matches what's actually being built.
- Pre-production / pre-launch. Final review before the system goes live. Validates the implementation against the design-time model; catches the gap between "what we said we'd build" and "what we built."
- Post-incident. After any meaningful security event, re-threat-model the affected component with the new attacker behavior as input. The post-incident model becomes durable institutional knowledge.
- Periodic refresh. Quarterly for high-criticality systems, annually for the rest. Confirms the diagram still reflects reality and the threats still rank the same way.
Threat modeling at any single stage helps; threat modeling across all five is how mature programs operate. The opportunity cost of one missed iteration is generally small; the opportunity cost of never threat modeling a critical system is what shows up in the post-incident report.
The agile threat model
The classic threat-modeling failure mode is the 200-page artifact: produced once, reviewed once, never updated, and quietly irrelevant by the time the system has shipped. The agile threat model is the antidote.
The mechanics: 30 minutes on a whiteboard with the engineer who knows the system, a security partner, and a product / ops person who knows the data sensitivity. Draw the data-flow diagram. Walk STRIDE per element. Write threats on sticky notes. End the session with a list of mitigations, each with an owner and a target date. Commit the diagram and the threat list to the repo. That is the artifact.
Re-run the session whenever the trust boundary changes - new identity type, new third-party integration, new data classification, new region. Re-run for new features whose blast radius the team isn't sure about. Skip the formality when the change is genuinely cosmetic. The discipline is in doing it often, not in doing it exhaustively.
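The committed threat list can be as plain as a JSON file. This is a sketch of one possible shape, with hypothetical field names, plus a check that every entry leaves the session with an owner and a parseable target date.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass
class Threat:
    id: str
    element: str
    stride: str
    description: str
    mitigation: str
    owner: str
    target_date: str  # ISO 8601 date, e.g. "2025-09-30"


def validate(threats):
    """Return a list of problems; an empty list means the register is usable."""
    problems = []
    for t in threats:
        if not t.owner.strip():
            problems.append(f"{t.id}: missing owner")
        try:
            date.fromisoformat(t.target_date)
        except ValueError:
            problems.append(f"{t.id}: bad target date")
    return problems


# Illustrative entries; owners and dates are made up.
register = [
    Threat("TM-1", "App tier", "I", "SSRF retrieves IMDS credentials",
           "Require IMDSv2", "platform-team", "2025-09-30"),
    Threat("TM-2", "RDS", "I", "Snapshot shared cross-account",
           "SCP denying snapshot share", "", "soon"),
]

problems = validate(register)
print(problems)
print(json.dumps([asdict(t) for t in register], indent=2)[:60])
```

Running a check like this in CI is optional; the point is that an entry without an owner or a date is a note, not a mitigation.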
The Threat Modeling Manifesto codifies the spirit: people and collaboration over processes, methodologies, and tools; a journey of understanding over a security or privacy snapshot; a culture of finding and fixing design issues over checkbox compliance.
What's different for cloud
The methodology survives the move to cloud unchanged. The threats it surfaces shift in six practical ways.
Identities everywhere
Every workload has an identity (instance profile, managed identity, service account); every human has multiple (corporate SSO, cloud console, federated roles); every service-to-service call carries one. Identity is the new perimeter and most cloud threats involve an identity primitive somewhere. STRIDE Spoofing and Elevation of Privilege dominate the cloud threat list. See IAM & Identity.
Control plane = attack surface
The cloud API is itself an attack surface. A traditional threat model worried about network-layer threats to a service; a cloud threat model worries about API-layer threats to the service and to the cloud control plane that manages it. CreateAccessKey, AssumeRole, PutBucketPolicy, and similar API actions are first-class threat surfaces.
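One way to treat API actions as threat surfaces is to scan identity policies for grants that include them. The sketch below works on an IAM policy document as a plain dict and flags Allow statements whose Action patterns match a short watch-list. The watch-list is an assumption for illustration, and the check deliberately ignores NotAction, Resource scoping, conditions, and explicit Deny statements.

```python
from fnmatch import fnmatchcase

# Assumed watch-list of control-plane actions; extend it for your own model.
SENSITIVE = [
    "iam:CreateAccessKey",
    "iam:PassRole",
    "sts:AssumeRole",
    "s3:PutBucketPolicy",
]


def risky_grants(policy):
    """Return the watch-listed actions that some Allow statement matches."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    found = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for pattern in actions:
            for action in SENSITIVE:
                # IAM action names are case-insensitive and allow * and ? wildcards.
                if fnmatchcase(action.lower(), pattern.lower()):
                    found.add(action)
    return sorted(found)


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["iam:*", "s3:GetObject"], "Resource": "*"},
    ],
}

findings = risky_grants(policy)
print(findings)
```

A match is a question for the threat model ("why does this principal need it?"), not proof of a flaw; an evaluator that honors the full policy language belongs in tooling such as IAM Access Analyzer.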
Shared responsibility
Some threats are already mitigated by the provider - physical access, hypervisor escape (mostly), DDoS scrubbing for managed front-doors. The model should explicitly mark which threats are out-of-scope because the provider owns them, and which are still yours. See Shared Responsibility.
Multi-tenancy assumptions
Every managed service is multi-tenant under the hood. The model must state explicitly what tenancy guarantees you rely on (per-account isolation, per-VPC isolation, KMS key-policy enforcement) and what would break if the provider's isolation model surprised you (extremely rare in practice, but the threat is non-zero - see breach reviews like LastPass and the GCP DataStudio class of disclosure).
Default stances differ
AWS defaults to deny on IAM, permit on networking-within-VPC; Azure defaults to a more permissive RBAC inheritance model; GCP defaults to organization-inherited deny. Threats that are obvious on one cloud are surprises on another. A cloud-portable threat model must call out which defaults the design depends on.
Change rate & ephemerality
Cloud workloads are ephemeral; networks are software-defined; identities can be created and destroyed in API calls. A threat model that's correct at design time but stale at runtime is a hazard. Pair the model with runtime sensing (CSPM/CNAPP) so the design intent and the operational state stay aligned.
Worked example: 3-tier app on AWS
The architecture: Route 53 → ALB → EC2 Auto Scaling Group running a Java application → RDS PostgreSQL. KMS for encryption keys. CloudTrail and VPC Flow Logs enabled. A typical SaaS pattern that looks deceptively simple on the diagram. A STRIDE-per-interaction pass produces roughly 20 threats; here's a representative subset.
| # | Element / interaction | STRIDE | Threat | Mitigation |
|---|---|---|---|---|
| 1 | User → Route 53 | S | DNS hijack via registrar account takeover | Registrar MFA, DNSSEC, monitoring NS-record changes |
| 2 | User → ALB | I | TLS downgrade, weak cipher | TLS 1.3 only policy, HSTS, ACM-managed cert |
| 3 | Internet → ALB | D | L7 DDoS exhausting backend | AWS WAF, Shield Advanced, ALB rate-based rules |
| 4 | ALB → App tier | S | Bypass of ALB by reaching EC2 directly | Security group on EC2 allows only ALB SG; private subnets |
| 5 | App tier (EC2) | I | SSRF retrieves IMDS credentials | IMDSv2 required, hop-limit 1, minimal instance-profile permissions |
| 6 | App tier (EC2) | E | Instance profile has overly broad RDS or S3 access | Scope role to specific resources; periodic IAM Access Analyzer review |
| 7 | App tier → RDS | S | SQL connection from compromised host using shared DB password | IAM database authentication; per-app database user; rotation |
| 8 | App tier → RDS | I | Unencrypted in-transit between app and DB | Force SSL on RDS parameter group; verify cert chain in app |
| 9 | RDS | I | Snapshot copied cross-account to attacker account | Resource policy on snapshot sharing; SCP denying snapshot share |
| 10 | RDS | T | Backup integrity not verified | Restore drill cadence; AWS Backup with vault lock |
| 11 | App tier → KMS | E | Key policy allows broad principals | Tight key policy; grant model; aws:SourceVpc condition |
| 12 | KMS | D | Key disabled or scheduled for deletion | Multi-region keys; CloudWatch alarms on disable/schedule actions |
| 13 | CloudTrail | R | Trail disabled by compromised principal | Org trail with S3 + KMS lock; SCP denying trail disable |
| 14 | CloudTrail S3 bucket | T | Log mutation | S3 Object Lock (compliance mode); bucket policy restricting writes to CT |
| 15 | App tier dependencies | T | Supply-chain compromise of a library | SBOM, pinned versions, dependency scanning in CI/CD |
| 16 | Deployment pipeline → EC2 | S | Forged deploy as a release | OIDC federation from CI to AWS; sigstore-signed artifacts; required reviewers |
| 17 | Operator → cloud console | E | Console-only break-glass without time-bound elevation | IAM Identity Center with permission sets; PIM-style time-bound roles |
| 18 | Operator → cloud console | R | Shared root credentials | Disable root key; MFA on root; per-human SSO identities |
| 19 | App → S3 (objects in upload bucket) | I | Public bucket access | Block Public Access at account level; SCP denying its disable |
| 20 | Whole architecture | D | Region-wide outage (single-region design) | Multi-AZ baseline; documented RPO/RTO; multi-region for tier-1 systems |
The exercise is fast - an experienced practitioner produces this list in under two hours. The output is a concrete remediation backlog and a documented design that future reviewers (and auditors) can read.
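Several of these mitigations can be verified against live configuration. As one example, row 5 (IMDSv2 required, hop limit 1) maps to the MetadataOptions block that the EC2 DescribeInstances API returns for each instance. The sketch below checks such a block passed in as a dict; fetching it from AWS is left out, and the finding messages are illustrative.

```python
def imds_findings(metadata_options):
    """Check an EC2 MetadataOptions dict against the row-5 mitigation."""
    findings = []
    if metadata_options.get("HttpEndpoint", "enabled") == "disabled":
        return findings  # IMDS is off entirely, so there is nothing to harden
    if metadata_options.get("HttpTokens") != "required":
        findings.append("IMDSv1 still allowed (HttpTokens is not 'required')")
    if metadata_options.get("HttpPutResponseHopLimit", 1) > 1:
        findings.append("hop limit above 1 lets forwarded requests reach IMDS")
    return findings


# Example blocks shaped like the API response for two instances.
weak = {"HttpEndpoint": "enabled", "HttpTokens": "optional", "HttpPutResponseHopLimit": 2}
hardened = {"HttpEndpoint": "enabled", "HttpTokens": "required", "HttpPutResponseHopLimit": 1}

print(imds_findings(weak))
print(imds_findings(hardened))
```

A hop limit of 1 can break containerized workloads that reach IMDS through an extra network hop, so treat the second finding as a prompt to confirm the design rather than an automatic failure.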
Worked example: LLM RAG application
The architecture: a customer-support chatbot using an LLM (call it Claude or GPT, your choice) with a RAG layer pulling from a vector database populated by ingesting product documentation and customer-uploaded knowledge bases. The agent has tool-use access to a few internal APIs (lookup ticket, escalate, write summary back to CRM). Cross-reference AI/ML Security for the full treatment.
The architecture introduces three trust boundaries that a STRIDE pass against a classic 3-tier app doesn't see:
- Retrieved content as input. Any document the RAG layer retrieves becomes part of the prompt. A poisoned document - a customer-uploaded knowledge base with embedded prompt-injection content - is an input the system trusts because it came from "your" vector store.
- The model itself. The model is a supply-chain dependency. Provider compromise, model-update behavior change, jailbreak technique discovery, training-data inversion attacks - all are categorically novel threats.
- Tool-use as authorization. Any tool the agent can call is a privilege the agent has. An agent that can call refund_customer can be coerced into doing so via crafted input. Tool-use authorization is a design problem most systems aren't built to handle.
| Threat | STRIDE | Specifics | Mitigation |
|---|---|---|---|
| Direct prompt injection | S/E | User crafts input that overrides system prompt; agent executes attacker intent | Input filtering, output filtering, system-prompt isolation, untrusted-input markers |
| Indirect prompt injection via retrieved content | T/E | Poisoned document in vector DB redirects agent behavior | Provenance metadata on retrieved docs; treat all retrieved content as untrusted |
| Vector DB poisoning | T | Attacker uploads document that biases retrieval for any future query | Per-tenant isolation in the vector DB; ingestion review; signed ingestion sources |
| Embedding inversion | I | Embeddings reverse-engineered to reveal source text | Don't treat embeddings as safe to expose; access-control embeddings as PII |
| Model supply chain | T | Compromised model weights, training-data poisoning | Use vetted providers; model card review; pinned model versions; integrity checks |
| Agentic tool abuse | E | Agent coerced to call a privileged tool (refund, send email, modify ticket) | Per-tool authorization separate from agent identity; human-in-the-loop on high-impact tools |
| Cross-tenant data leakage | I | Agent retrieves another tenant's vector entries via shared index | Tenant ID enforced in retrieval queries; per-tenant index where threat model demands |
| Sensitive data in prompts | I | PII leaked into model provider's logs / training pipeline | PII redaction at the boundary; provider zero-retention agreement; on-tenant models for sensitive workloads |
| Cost / DoS | D | Adversarial prompts maximizing tokens; recursion in tool use | Per-user rate limits; max-token caps; recursion-depth caps on tool calls |
| Output exfiltration via markdown | I | Model output renders an attacker-controlled image URL with PII in the path | Strip image and link rendering from untrusted output; content security policy on chat UI |
The lesson generalizes: the same STRIDE walk surfaces these threats, but a practitioner who hasn't seen agentic systems before will miss the tool-abuse category entirely. The threat library matters - extending ATT&CK / MITRE ATLAS / OWASP LLM Top 10 into your team's working vocabulary takes minutes and produces better models for years.
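The agentic tool abuse mitigation, per-tool authorization separate from the agent's identity, can be sketched as a gate that runs before any tool call. Everything here is hypothetical: the policy table, the role names, and the three-way decision are one possible design, and the tool names follow the example architecture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    allowed_roles: frozenset
    needs_human_approval: bool


# Hypothetical per-tool policy; the decision depends on the end user's role,
# not on what the model asked for.
POLICIES = {
    "lookup_ticket": ToolPolicy(frozenset({"customer", "support_agent"}), False),
    "refund_customer": ToolPolicy(frozenset({"support_agent"}), True),
}


def authorize(tool, user_role, human_approved=False):
    """Return "allow", "deny", or "needs_approval" for a requested tool call."""
    policy = POLICIES.get(tool)
    if policy is None:
        return "deny"  # unknown tools are denied by default
    if user_role not in policy.allowed_roles:
        return "deny"
    if policy.needs_human_approval and not human_approved:
        return "needs_approval"
    return "allow"


print(authorize("refund_customer", "customer"))
print(authorize("refund_customer", "support_agent"))
print(authorize("refund_customer", "support_agent", human_approved=True))
```

The gate sits outside the model, so a successful prompt injection can change what the agent requests but not what the gate permits.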
Worked example: multi-account AWS landing zone
The architecture: AWS Organizations with a hub-and-spoke account model - management account, log-archive account, security tooling account, network account, and per-environment accounts (dev, staging, prod) plus per-team product accounts. IAM Identity Center for human access; OIDC federation for CI/CD; central transit gateway; org-wide CloudTrail; SCPs and RCPs in place. See Landing Zones for the architecture itself.
The threats that emerge are organization-level - and they're frequently invisible to single-account threat models.
- Management account compromise. The blast radius is the entire organization. Threats: phished root, compromised admin in management account, malicious OU change, billing-account capture. Mitigations: minimal occupants of the management account and no workloads in it; delegated administration from member accounts where a service supports it; IAM Access Analyzer at the org level; billing console isolated. SCPs do not restrict principals in the management account, so they cannot be the safeguard here.
- Cross-account trust attacks. Roles in account A that trust principals in account B. Threats: confused-deputy via external-id absence, role-trust policies trusting * or "any AWS principal," AssumeRole chaining that accumulates privilege. Mitigations: required external-id on cross-account roles; audit of trust policies via Access Analyzer; SCPs constraining who can be assumed.
- Organization-level privilege escalation. Threats: assumption of OrganizationAccountAccessRole from management; AWS Control Tower role abuse; iam:PassRole into a service that has org-level privileges. Mitigations: SCPs that deny dangerous actions at the OU level; deny-list policies for the highest-risk APIs; PIM-style time-bound elevation.
- Log-archive integrity. Threats: log archive in a separate account, but with a role from another account that can mutate it. Mitigations: S3 Object Lock in compliance mode; KMS key policy that the archive account alone controls; resource policy denying all writes except CloudTrail's.
- Network plumbing. Threats: transit-gateway route leakage exposing dev to prod; VPC peering misconfiguration; PrivateLink endpoint that exposes a managed service across account boundaries. Mitigations: route-table reviews; SCPs denying VPC peering with accounts outside the organization; network account ownership of all centralized network resources.
- Identity provider drift. Threats: SAML / OIDC trust relationships that survive employee departure; IAM Identity Center permission sets that aren't reviewed; SCP changes that exempt specific roles. Mitigations: identity-lifecycle automation tied to HRIS; quarterly permission-set review; alerting on SCP changes.
- New-account provisioning. Threats: a newly-created account briefly without all SCPs / Config rules / CloudTrail; a Control Tower customization gap producing a less-secure account. Mitigations: account-vending pipeline that bakes in all controls before the account is ready for use; account-ready-check before handoff.
- Backup / restore boundary. Threats: a snapshot that's restorable cross-account; AWS Backup vaults in workload accounts the workload's compromised principal can wipe. Mitigations: AWS Backup vault in the log-archive account; vault lock; SCP denying snapshot share outside-org.
The threat model at the landing-zone layer is what determines whether the per-workload threat models can rely on the boundaries they assume. If the landing-zone threats aren't worked through, every per-workload model is sitting on assumptions that may not hold.
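The cross-account trust threats above lend themselves to a small static review. The following is a minimal sketch, assuming role trust policies have already been exported as JSON; the helper name review_trust_policy, the finding strings, and the account IDs are hypothetical, and it complements rather than replaces IAM Access Analyzer.

```python
def _as_list(value):
    """IAM policy fields may hold one value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def review_trust_policy(policy: dict, org_account_ids: set) -> list:
    """Return findings for one role trust policy (hypothetical helper)."""
    findings = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if principal == "*":  # "Principal": "*" trusts anyone
            findings.append("wildcard principal")
            continue
        string_equals = stmt.get("Condition", {}).get("StringEquals", {})
        has_external_id = "sts:ExternalId" in string_equals
        for arn in _as_list(principal.get("AWS", [])):
            if arn == "*":
                findings.append("wildcard AWS principal")
                continue
            # The principal is either an ARN or a bare 12-digit account ID.
            account_id = arn.split(":")[4] if arn.startswith("arn:") else arn
            if account_id not in org_account_ids and not has_external_id:
                findings.append(
                    f"external account {account_id} trusted without sts:ExternalId"
                )
    return findings


# Illustrative policy with made-up account IDs.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "sts:AssumeRole",
         "Principal": {"AWS": "arn:aws:iam::999999999999:root"}},
        {"Effect": "Allow", "Action": "sts:AssumeRole",
         "Principal": {"AWS": "*"}},
    ],
}
findings = review_trust_policy(policy, org_account_ids={"111111111111"})
```

The sketch only inspects AWS principals and the StringEquals condition; service principals, federated principals, and other condition operators would need their own handling.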
Threat libraries vs custom
A threat library is a curated list of known threat patterns you start from instead of inventing every threat from first principles. The build-vs-buy decision tracks the maturity of the program.
- Start with libraries. ATT&CK Cloud, CAPEC, OWASP Top 10 / API Top 10 / LLM Top 10, CIS Threat Catalog, NIST SP 800-30 generic threat list. They cover most of what a small program needs and they speak the same vocabulary your detection engineers, red team, and auditors do.
- Add domain libraries where relevant. ATT&CK for Containers if you run Kubernetes (see Kubernetes). MITRE ATLAS if you ship ML systems. NIST SP 800-82 if you have OT. STRIDE-LM (STRIDE extended with a Lateral Movement category) when post-compromise movement between systems matters.
- Build a custom library when patterns repeat. The fifth time a team rediscovers the same SSRF-to-IMDS threat, write it into your internal library. Annotate with your default mitigations, your detection coverage, and the team that owns the response. Custom libraries shorten the time-to-first-threat in every subsequent session.
- Keep custom libraries small. A 200-entry custom library is a library nobody can hold in their head. Prefer 30 high-value patterns specific to your stack, plus the public libraries for breadth. Crosswalk to ATT&CK so the custom entries stay readable by outsiders.
Risk scoring threats
The scoring conversation is where threat modeling meets risk management. Several rubrics exist; none are universally good.
DREAD (deprecated)
Damage, Reproducibility, Exploitability, Affected users, Discoverability - Microsoft's original 1-10 rubric per dimension. Informally deprecated; the scores are too subjective, the dimensions overlap, and different reviewers produce wildly different numbers. A small team with strong shared calibration can still use it; most programs have moved on.
CVSS
The Common Vulnerability Scoring System. Excellent for known vulnerabilities with specific exploit characteristics. Poor for design threats, which don't have a defined exploit yet. Use CVSS for CVE prioritization in vulnerability management; don't try to bend it into a design-review rubric.
FAIR (Factor Analysis of Information Risk)
FAIR is a quantitative model: loss-event frequency × loss magnitude, decomposed into estimable factors with calibrated ranges. Outputs dollar-denominated risk that executives can rank against other investments. The setup cost is meaningful (training, calibration, tooling) and the rigor pays off on high-stakes decisions; overkill for day-to-day STRIDE-session triage.
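To make the frequency × magnitude decomposition concrete, here is a minimal Monte Carlo sketch. It rests on several assumptions: the (min, most likely, max) ranges are invented for illustration, triangular distributions stand in for the calibrated ranges a FAIR analysis would use, and annualized loss is simplified to a single frequency × magnitude product per trial. The function name simulate_annual_loss is hypothetical.

```python
import random
import statistics


def simulate_annual_loss(freq_range, magnitude_range, trials=10_000, seed=7):
    """Sample annual loss from (min, most likely, max) estimates."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    f_lo, f_mode, f_hi = freq_range
    m_lo, m_mode, m_hi = magnitude_range
    losses = []
    for _ in range(trials):
        frequency = rng.triangular(f_lo, f_hi, f_mode)  # loss events per year
        magnitude = rng.triangular(m_lo, m_hi, m_mode)  # dollars per loss event
        losses.append(frequency * magnitude)
    losses.sort()
    return {
        "mean": statistics.fmean(losses),
        "p50": losses[len(losses) // 2],
        "p90": losses[int(len(losses) * 0.9)],
    }


# Illustrative estimates: 0.1 to 2 events per year, $50k to $1M per event.
result = simulate_annual_loss((0.1, 0.5, 2.0), (50_000, 200_000, 1_000_000))
```

Reporting a median and a 90th percentile rather than a single number is what lets the output be ranked against other investments without hiding the uncertainty.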
Simple Likelihood × Impact
Three or five buckets for each dimension (Low / Medium / High, or 1-5). Produces a 9- or 25-cell heat map. Imperfect, but fast, repeatable, and good enough to drive ranking inside a session. Most agile threat models live here.
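A three-bucket version fits in a few lines. This is a sketch with illustrative threats and ratings, not a recommended calibration; the numeric weights of 1 to 3 are an assumption.

```python
LEVELS = {"Low": 1, "Medium": 2, "High": 3}


def score(likelihood: str, impact: str) -> int:
    """Cell value in the 3x3 heat map, from 1 to 9."""
    return LEVELS[likelihood] * LEVELS[impact]


# (threat, likelihood, impact) - example entries only.
threats = [
    ("SSRF to instance metadata", "Medium", "High"),
    ("Verbose error pages", "High", "Low"),
    ("Log archive tampering", "Low", "High"),
]

# Highest score first; ties keep their original order.
ranked = sorted(threats, key=lambda t: score(t[1], t[2]), reverse=True)
```

Ties are common with only nine cells, which is one reason the ranking works best as a conversation starter inside the session rather than a final ordering.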
OWASP Risk Rating
A documented Likelihood × Impact methodology with sub-factors (skill required, opportunity, awareness, etc.). Good middle ground between simple heuristic and FAIR; widely used for application-security risk ranking.
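The mechanics can be sketched as follows, assuming the published OWASP conventions: each sub-factor is scored 0-9, the averages are banded into Low / Medium / High, and a lookup table combines the two levels into an overall severity. The factor scores in the example are invented.

```python
from statistics import fmean


def level(score: float) -> str:
    """Band an averaged score: 0 to <3 Low, 3 to <6 Medium, 6 to 9 High."""
    if score < 3:
        return "Low"
    if score < 6:
        return "Medium"
    return "High"


# Overall severity keyed by (likelihood level, impact level).
SEVERITY = {
    ("Low", "Low"): "Note", ("Medium", "Low"): "Low", ("High", "Low"): "Medium",
    ("Low", "Medium"): "Low", ("Medium", "Medium"): "Medium", ("High", "Medium"): "High",
    ("Low", "High"): "Medium", ("Medium", "High"): "High", ("High", "High"): "Critical",
}


def owasp_severity(likelihood_factors, impact_factors):
    """Return (likelihood level, impact level, overall severity)."""
    likelihood = level(fmean(likelihood_factors))
    impact = level(fmean(impact_factors))
    return likelihood, impact, SEVERITY[(likelihood, impact)]


# Eight likelihood and eight impact sub-factor scores, all illustrative.
result = owasp_severity([6, 7, 7, 6, 8, 6, 6, 8], [6, 5, 5, 7, 3, 2, 5, 7])
```

Keeping the sub-factor scores alongside the final severity makes it easier to see which assumption to revisit when two reviewers disagree.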
Pick one and stay consistent. Mixing rubrics within a program makes risk-register entries incomparable. The choice matters less than the consistency.
Threat modeling as code
The same instinct that produced infrastructure-as-code and policy-as-code produces threat-modeling-as-code: keep the model in the repo, version it, diff it on changes, and run continuous checks against it.
- Pytm - define the threat model in Python: actors, processes, data stores, dataflows, and trust boundaries as code; the library generates DFDs and a STRIDE-aligned threat list. Excellent for repos where the architecture changes faster than the documentation.
- Threatspec - embed threat and mitigation annotations directly in source code as comments; the tool aggregates them into a model. Useful for keeping the model glued to the code it describes.
- OWASP Threat Dragon CLI - runs Threat Dragon models from the command line; integrates into CI to validate that the threat model file exists, parses, and is up to date relative to a defined cadence.
- Threat Composer - AWS-published, browser-based, file-on-disk; produces a structured threat-statement format ("A [threat source] [prerequisites] can [threat action], which leads to [threat impact], negatively impacting [impacted assets]") that's notably good for clarity.
- IaC → diagram → threats - newer commercial tools generate first-pass DFDs directly from Terraform or CloudFormation, and produce threat lists from the resulting diagrams. Cuts the "I'll update the diagram next sprint" failure mode.
The GitOps pattern: threat model lives at /threat-model/ in the repo; PRs that change architecture must update the model; CI fails the build if the model is missing, fails to parse, or parses but is older than the N-day limit set in a control file; periodic review issues open automatically. Same hygiene as policy-as-code; same payoff.
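The CI staleness gate can be sketched in a few lines. This is a minimal example under stated assumptions: the model is a JSON file with a top-level last_reviewed ISO date (a hypothetical schema, not the format of any particular tool), and the age limit is passed as a parameter rather than read from a control file.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def check_threat_model(model_path: Path, max_age_days: int, today: date) -> list:
    """Return failure reasons; an empty list means the check passes."""
    if not model_path.exists():
        return [f"{model_path.name} is missing"]
    try:
        model = json.loads(model_path.read_text())
    except json.JSONDecodeError as exc:
        return [f"{model_path.name} does not parse: {exc.msg}"]
    reviewed = model.get("last_reviewed") if isinstance(model, dict) else None
    if reviewed is None:
        return [f"{model_path.name} has no last_reviewed date"]
    age_days = (today - date.fromisoformat(reviewed)).days
    if age_days > max_age_days:
        return [f"{model_path.name} last reviewed {age_days} days ago (limit {max_age_days})"]
    return []


# Demo against a throwaway file instead of a real repository.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.json"
    path.write_text(json.dumps({"last_reviewed": "2025-01-01"}))
    stale = check_threat_model(path, max_age_days=90, today=date(2025, 6, 1))
    fresh = check_threat_model(path, max_age_days=90, today=date(2025, 2, 1))
```

In a pipeline, a wrapper would print the returned reasons and exit non-zero when the list is non-empty. Passing today in explicitly keeps the check testable.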
AI in threat modeling
LLM-assisted threat modeling moved from research to mainstream practice between 2023 and 2026. The pragmatic 2026 pattern uses AI as a first-draft generator and a completeness validator, with a human reviewer making the keep/discard/edit calls.
Where AI helps
- STRIDE generation from a diagram. Upload a DFD; the LLM produces a STRIDE-per-element threat list within seconds. Quality is roughly junior-analyst level - useful as a draft, requires review.
- Completeness checking. "Given this model, what STRIDE categories are missing across elements?" LLMs are good at spotting the obvious gaps a tired reviewer drops.
- Mitigation suggestions. For a given threat, the LLM produces the canonical mitigation list. Excellent for surfacing the standard playbook quickly; weaker on context-specific trade-offs.
- Cross-walks. "Map this threat to ATT&CK techniques, CAPEC patterns, and OWASP top 10 entries." Tedious for humans, fast for LLMs; spot-check the mapped identifiers before they go into the model.
- Diagram generation from text. Describe the architecture in prose; get a first-pass DFD. Still rough, but improving.
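A deterministic baseline is useful next to the LLM for the completeness check, since it gives the reviewer something to compare the model's answer against. The sketch below assumes the commonly cited STRIDE-per-element applicability chart and a hypothetical input shape of (element, category letter) pairs; the element names are invented.

```python
# Which STRIDE letters apply to each DFD element type.
# Repudiation on data stores mainly matters when the store holds logs.
APPLICABLE = {
    "external_entity": set("SR"),
    "process": set("STRIDE"),
    "data_flow": set("TID"),
    "data_store": set("TRID"),
}


def missing_categories(elements: dict, threats: list) -> dict:
    """Map each element to the applicable STRIDE letters with no recorded threat."""
    covered = {}
    for element, category in threats:
        covered.setdefault(element, set()).add(category)
    gaps = {}
    for name, kind in elements.items():
        gap = APPLICABLE[kind] - covered.get(name, set())
        if gap:
            gaps[name] = gap
    return gaps


elements = {"browser": "external_entity", "api": "process", "orders_db": "data_store"}
threats = [
    ("browser", "S"),
    ("api", "S"), ("api", "T"), ("api", "E"),
    ("orders_db", "T"), ("orders_db", "I"),
]
gaps = missing_categories(elements, threats)
```

A gap is a prompt to look, not a finding: some categories are legitimately not applicable to a given element, and recording that decision is part of the model.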
Where humans still own the work
- Trust-boundary judgment. The LLM doesn't know your tenancy model, your customer trust assumptions, or which integrations are contractually constrained. Humans set the boundaries.
- Risk acceptance. The decision to accept a risk is an organizational one with budgetary and political dimensions. AI can recommend; only an authorized human can decide.
- Novel threats. Anything that isn't in the LLM's training distribution gets a hallucinated answer or no answer at all. The cutting edge of threat modeling - new attack classes, new architectures - remains human work.
- Auditor sign-off. The auditor needs a human signatory. Tooling outputs become evidence; humans remain accountable.
Vendors are racing to integrate LLM features - IriusRisk, ThreatModeler, Devici, and others have shipped agent-style flows that generate first-pass models from architectural prose. Treat them like AI-assisted coding: productivity multipliers when used well, automation traps when used uncritically.
Common pitfalls
- Threat modeling once and never updating. The model becomes archaeology. Re-run at every meaningful design change; commit the model to the repo so updates are visible in code review.
- Modeling components instead of trust boundaries. The threats live at the boundaries. A model that lists components but doesn't mark the dashed lines is missing the highest-signal part of the analysis.
- No asset valuation. Without knowing what's valuable, every threat looks equally important. Tag every data store and process with a sensitivity / criticality so the threat list can be ranked.
- No mitigation discipline. The output of a session is a threat list with mitigations, owners, and dates. A list of threats without owners is a list of complaints.
- Treating the model as the goal. The model is a means to a safer system. A team that produced ten high-quality models last year but shipped two mitigations from them had a bad year, regardless of how thorough the modeling looked.
- Single methodology fundamentalism. STRIDE-only programs miss privacy threats. PASTA-only programs become a check-the-stages ritual. Combine methodologies - STRIDE for breadth, attack trees for the highest-priority threats, LINDDUN for privacy-sensitive systems.
- Excluding engineering. A threat model produced by security and dropped on engineering is a culture failure. Engineers in the room produce better models and ship more mitigations.
- Ignoring out-of-scope assumptions. "The provider handles that" is a valid statement only when it's documented. Make the shared-responsibility line explicit on the diagram so the assumption is auditable.
- Tool fetishism. Buying a $50,000 platform doesn't produce threat models; it produces a $50,000 expense. Tools support the work; they don't perform it.
- No third-question discipline. Teams that stop at "what can go wrong?" without rigorously answering "what are we going to do about it?" produce risk registers that grow forever. The third question is non-optional.
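The mitigation-discipline pitfall is easy to check mechanically once the threat list is structured data. This is a minimal sketch; the Threat record and its field names are hypothetical rather than the schema of any tool named on this page.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Threat:
    title: str
    mitigation: Optional[str] = None
    owner: Optional[str] = None
    due: Optional[date] = None


def unactionable(threats):
    """Return (title, missing fields) for threats lacking mitigation, owner, or due date."""
    problems = []
    for threat in threats:
        missing = [
            name for name in ("mitigation", "owner", "due")
            if getattr(threat, name) is None
        ]
        if missing:
            problems.append((threat.title, missing))
    return problems


# Example entries only.
session_output = [
    Threat("SSRF to instance metadata", "Enforce IMDSv2", "platform-team", date(2026, 3, 1)),
    Threat("Log archive tampering", "Object Lock in compliance mode"),
]
problems = unactionable(session_output)
```

Running a check like this at the end of a session turns "a list of complaints" into a visible to-do: every entry it returns needs an owner or a date before the meeting closes.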
Maturity stages
Stage 1 - Ad-hoc
Threat modeling happens occasionally, driven by individual practitioners. No common methodology. No shared template. Output lives in scattered Confluence pages or design docs that age out.
Stage 2 - Documented
A methodology is documented (typically STRIDE-per-element). A template exists. High-priority projects run a model at design review. Output is captured in a known location; mitigations are tracked.
Stage 3 - Repeatable
Threat modeling is a defined SDLC step. Trained internal facilitators are available across teams. A custom threat library has emerged. Mitigations are tracked through to closure in the engineering tracker.
Stage 4 - Integrated
Threat models live next to code in the repo. CI surfaces stale models. Cross-functional reviews include privacy (LINDDUN), reliability, and compliance. ATT&CK coverage is tracked end-to-end from design through detection.
Stage 5 - Continuous
Threat modeling is a continuous practice, not a project gate. Quantitative risk scoring (FAIR or equivalent) feeds the program's investment decisions. AI assistance is in the workflow with human review. Models, detections, mitigations, and tests share a common vocabulary.
The skip-stage cost
Jumping from Stage 1 to "AI-assisted continuous threat modeling" without the intermediate discipline produces lists nobody owns. Sequence matters; tooling is a force multiplier on a working program, not a substitute for it.
Further reading
Books
- Adam Shostack - Threat Modeling: Designing for Security - the defining textbook of the field.
- Adam Shostack - Threats: What Every Engineer Should Learn from Star Wars - engineer-friendly introduction.
- UcedaVélez & Morana - Risk Centric Threat Modeling - the PASTA reference.
- Threat Modeling Gameplay with EOP - using the Elevation of Privilege card game in practice.
Manifestos and standards
- Threat Modeling Manifesto - the field's "agile manifesto" equivalent.
- OWASP Threat Modeling - community-curated reference.
- MITRE ATT&CK Cloud Matrix
- MITRE ATLAS - ML-specific threat library.
- CAPEC - attack pattern catalog.
- MITRE D3FEND - defender-side counterpart to ATT&CK.
- OWASP LLM Top 10
Tools
- OWASP Threat Dragon
- Microsoft Threat Modeling Tool
- AWS Threat Composer
- Pytm
- Threatspec
- IriusRisk
- ThreatModeler
- Devici
Related CSOH pages
- API Security - the layer where the application-side STRIDE work concentrates.
- AI/ML Security - threat modeling for LLM and agentic systems in depth.
- GRC - how threat-model outputs feed control evidence and risk registers.
- IAM & Identity - the primitive most cloud threats turn on.
- CSPM vs CNAPP - the runtime sensors that validate the design-time model.
- Cloud SOC - translating threats into detections.
- Detection Engineering - coverage analysis against ATT&CK.
- Landing Zones - the org-level architecture the worked example sits on.
- Glossary - every term on this page, defined.
FAQ
When should I threat model?
As early as possible - ideally during design review, before architectural decisions are baked into code. The marginal cost of changing an architectural choice on a whiteboard is near zero; the same change after six months of implementation can be a quarter of engineering effort. Re-threat-model whenever the trust boundary changes - new third-party integration, new identity type, new data classification, new region, new tenancy model. For mature programs the rhythm is: a 30-minute lightweight session per significant design change, plus a deeper review on a quarterly cadence for high-criticality systems.
STRIDE vs PASTA - which should I use?
STRIDE is a categorization checklist - six threat classes applied to each element or interaction in your model. It's fast, teachable, and produces a usable list in a 30-minute session. PASTA is a seven-stage risk-centric methodology that runs from business objectives through attack simulation to risk scoring. It produces a more rigorous artifact for high-stakes systems but costs days, not minutes. Most practitioners use STRIDE as the day-to-day tool and reach for PASTA when an executive sponsor wants quantified risk on a regulated or revenue-critical workload.
Who should be in the room?
The smallest group that can answer the four questions. At minimum: the engineer or architect who knows the system best, a security practitioner familiar with the relevant cloud, and a product or operations partner who knows the data sensitivity and SLAs. For larger or higher-risk systems add an IAM specialist, a privacy or compliance reviewer (especially for LINDDUN), and a representative of any downstream team your trust boundary touches. Avoid making it a 12-person ceremony - past 6 attendees the signal-to-noise drops sharply.
How long should it take?
For a typical microservice or feature: 30-60 minutes for a STRIDE-per-element pass on a whiteboard, plus another hour or two to write up the findings and mitigations. For a complete new system: a half-day workshop, then a follow-up review of the documented model. PASTA-style modeling on a regulated workload can run multiple days across stages. The discipline isn't the duration - it's that the session ends with a written list of threats, ranked, with owners and target dates for mitigations.
Is DREAD still recommended for risk scoring?
DREAD has been informally deprecated for over a decade - the scores are subjective, the dimensions overlap, and different reviewers produce different numbers on the same threat. Modern programs use a simple Likelihood × Impact heuristic for triage, CVSS only for known CVEs (it's a vulnerability metric, not a design metric), and FAIR when the business needs dollar-denominated risk for investment decisions.
How does threat modeling differ for an LLM-based application?
The classic STRIDE categories still apply - but the trust boundaries shift. Retrieved content becomes a new attack surface (prompt injection via documents in a vector store), the model itself is a supply-chain dependency, and tools the agent can call are part of your attack surface (any agent that can call a payment API can be coerced into doing so). See AI/ML Security for the full treatment, including OWASP LLM Top 10 and MITRE ATLAS.
Can AI tools write threat models for me?
Increasingly they can produce a credible first draft - generate STRIDE elements from a diagram, suggest mitigations from a threat library, validate completeness against known patterns. But they don't yet replace the human judgment that decides which threats your organization will accept, which to mitigate, and how much engineering budget to spend on each. The pragmatic 2026 pattern is LLM-assisted generation with a human reviewer who edits the output, prunes false positives, and signs the result.
Where next
- API Security - the application-layer threats that show up first in most cloud workloads.
- AI/ML Security - extend the methodology to LLMs and agents.
- GRC - where the threat-model output meets the audit evidence chain.
- IAM & Identity - most cloud threats turn on this primitive.
- Detection Engineering - translate the modeled threats into runtime detections.
- Friday Zoom - threat modeling sessions are a recurring topic. Drop in.