CI/CD for Cloud Deployments

The pipeline that ships code from a developer's laptop to a production cloud account. Vendor-neutral guide to what's in a modern pipeline, how OIDC federation replaced long-lived access keys, and how AWS, Azure, and GCP toolchains compare side-by-side.

The 30-second version: A CI/CD pipeline takes a developer's commit and turns it into running cloud infrastructure - automatically, repeatably, and with a security boundary at every step. CI (continuous integration) builds and tests the change; CD (continuous delivery / deployment) ships it to a cloud account.

The modern shape: a git push triggers a pipeline runner (GitHub Actions, GitLab CI, Jenkins, CodePipeline, Cloud Build, Azure DevOps), which builds an artifact, runs tests and security scans, then assumes a short-lived cloud role via OIDC federation - no long-lived access keys anywhere - and applies the change. Every step is in version control. Every step is auditable. The pipeline is the only identity allowed to change production.

On this page

  1. What CI/CD actually is
  2. Why it matters for cloud deployments
  3. The anatomy of a modern pipeline
  4. OIDC federation - replacing long-lived keys
  5. Secrets in pipelines
  6. AWS, Azure, and GCP side-by-side
  7. AWS - CodePipeline, OIDC, and the IAM model
  8. Azure - Azure DevOps, GitHub Actions, Workload Identity
  9. GCP - Cloud Build and Workload Identity Federation
  10. Deployment strategies - blue/green, canary, progressive
  11. Securing the pipeline itself
  12. Infrastructure-as-code in the pipeline
  13. Bootstrapping path
  14. Common pitfalls
  15. Further reading
  16. FAQ

What CI/CD actually is

Strip the marketing off and CI/CD is two related practices, glued together by an automated pipeline.

Continuous Integration (CI) is the practice of merging code changes into a shared branch frequently - many times per day per developer - and having an automated build and test run on every merge. The point is that problems surface in minutes, not at the end of a multi-week integration phase. CI is what makes "main is always green" a realistic goal.

Continuous Delivery (CD) extends the pipeline so every passing build is automatically packaged into a deployable artifact and shipped to a staging environment. A human still clicks "release to production." Continuous Deployment goes one step further: every passing build is automatically released to production, with the pipeline's automated checks and progressive rollout as the safety net.

In practice, the acronym "CD" stretches over both. What matters is the pipeline shape: commit → build → test → scan → deploy, automated end-to-end, with the cloud-side credentials short-lived and tightly scoped.

None of this is cloud-specific in principle. It became essential for cloud because cloud infrastructure is itself code - every account, every network, every IAM role - and the pipeline is the only sustainable way to keep that code in sync with reality across dozens of environments.

Why it matters for cloud deployments

Picture two teams running the same workload in the cloud. Both deploy to AWS, Azure, or GCP.

Team A deploys from a developer laptop. The laptop has a long-lived access key with broad permissions. The Terraform state lives on someone's hard drive. When the original engineer goes on vacation, nobody knows what version is in prod or how to roll it back. When a credential leaks to GitHub, the blast radius is "everything that engineer had access to" - usually most of the account.

Team B deploys from a pipeline. The pipeline runs in CI, authenticates to the cloud via OIDC for a session that lasts 15 minutes, applies a Terraform plan reviewed in a pull request, and writes the state to a remote backend with locking. Every deploy is in git history. Rolling back is reverting a commit. When a developer's laptop is compromised, the attacker gets nothing useful for production - the developer cannot deploy directly.

The CI/CD pipeline is the difference between those two paths. It's the operational expression of three security principles at once:

The pipeline also pays back hard on day 200, when an org has 50 workloads in 50 accounts. Without it, every workload's deploy is bespoke. With it, every workload's deploy follows the same shape - scan, plan, apply, verify - and a single change to the template improves all 50.

The anatomy of a modern pipeline

Every credible cloud-deploy pipeline has the same eight stages. The runner differs; the shape doesn't.

1. Source & trigger

A git event (push, merge to main, tag, manual dispatch) kicks off the pipeline. The commit SHA is the unit of work - every later stage refers back to it.

2. Build

Compile source. Package into the deploy artifact - a container image, a zip, a Terraform plan, a Helm chart. Pin every dependency version. Cache aggressively.

3. Test

Unit tests, integration tests, contract tests. Fast feedback first; slower end-to-end tests last. Fail closed - a flaky test fails the build; it does not get ignored.

4. Scan

SAST on source, SCA on dependencies, secret scanning on the diff, container scanning on the image, IaC scanning on the Terraform / Bicep / CloudFormation. SBOM generation.

5. Sign & attest

Sign the artifact (Cosign / Sigstore). Attach provenance attestations (SLSA) so the deploy target can verify what built it.

6. Authenticate to cloud

OIDC token exchange for short-lived credentials. Scoped to one repo, one branch, one role. No long-lived access keys stored in CI.

7. Deploy

Terraform apply, helm upgrade, gcloud run deploy, az deployment, kubectl apply, ECS / Lambda / Functions update. Progressive rollout where possible (canary, blue/green).

8. Verify & observe

Smoke tests against the deployed environment. Watch error rates, latency, and security findings for a defined window. Auto-rollback on regression.

A pipeline missing any of these stages isn't necessarily wrong - small projects can collapse stages 3, 4, and 5 into one job - but it should be a deliberate choice, not an omission. The gaps are where breaches live.
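
As a rough sketch of that shape, the stages can be modeled as an ordered list where the first failure stops everything after it. The stage names mirror the eight stages above; the Stage type, run_pipeline, and demo_stages are hypothetical stand-ins for illustration, not any runner's real API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    name: str
    run: Callable[[str], bool]  # receives the commit SHA, returns True on success


def run_pipeline(commit_sha: str, stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure (fail closed)."""
    completed: List[str] = []
    for stage in stages:
        if not stage.run(commit_sha):
            return False, completed  # nothing after a failed stage runs
        completed.append(stage.name)
    return True, completed


STAGE_NAMES = [
    "source", "build", "test", "scan",
    "sign", "authenticate", "deploy", "verify",
]


def demo_stages(failing: str = "") -> List[Stage]:
    # Hypothetical checks: each stage succeeds unless its name equals `failing`.
    return [Stage(name, lambda sha, n=name: n != failing) for name in STAGE_NAMES]
```

With demo_stages(failing="scan"), the run stops after "test", so the authenticate and deploy stages never execute. That ordering is the point: credentials are only requested once everything before them has passed.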

OIDC federation - replacing long-lived keys

If there's one thing to take from this page, it's this: do not store long-lived cloud access keys in your CI system. Use OIDC federation instead.

The pattern works the same way on every cloud:

  1. Your CI platform (GitHub Actions, GitLab, CircleCI, Buildkite, Jenkins with a plugin) acts as an OIDC identity provider. Every workflow run gets a unique, short-lived signed JWT describing which repo, which branch, which workflow, which run, which actor.
  2. The cloud side trusts that issuer for specific subject claims. AWS IAM, Azure Entra, and Google Cloud all support this out of the box.
  3. When the pipeline needs to deploy, it presents its OIDC token. The cloud verifies the issuer's signature, checks the subject matches a configured trust, and returns short-lived credentials (15 minutes to 1 hour) scoped to a specific role.
  4. The pipeline uses those credentials, then they expire.
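
The cloud-side decision in step 3 can be sketched as a lookup against configured trust rules. This is a simplified illustration, not any cloud's implementation: it assumes the token's signature and expiry were already verified against the issuer's published keys, and TrustRule and match_trust are hypothetical names. The iss, aud, and sub claim names are standard JWT claims; the values in the usage note are examples.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class TrustRule:
    issuer: str       # expected "iss" claim
    audience: str     # expected "aud" claim
    subject: str      # expected "sub" claim, exact match only
    role: str         # role handed out when the rule matches
    ttl_seconds: int  # lifetime of the credentials that would be issued


def match_trust(claims: Mapping[str, str],
                rules: Sequence[TrustRule]) -> Optional[TrustRule]:
    """Return the first rule the token satisfies, or None.

    ASSUMPTION: signature and expiry checks happened before this call.
    """
    for rule in rules:
        if (claims.get("iss") == rule.issuer
                and claims.get("aud") == rule.audience
                and claims.get("sub") == rule.subject):
            return rule
    return None
```

A token whose sub is repo:myorg/myrepo:ref:refs/heads/main would match a rule written for that exact subject, while a token from a pull request run (sub of repo:myorg/myrepo:pull_request on GitHub Actions) would get nothing back.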

Why this matters:

Long-lived access keys in CI are now considered a legacy pattern - virtually every leak postmortem from the last several years (Travis CI 2022, CircleCI 2023, dependency confusion incidents) features one. OIDC removes the secret entirely.

Trust scoping is the security boundary. When you configure OIDC on the cloud side, the subject claim is what stands between your production role and any random pull request. Scope to ref:refs/heads/main at minimum. Scope to specific environment: claims if your CI supports them. Never accept a wildcard subject - that's equivalent to leaving the role open to every workflow run in your repo, including untrusted fork PRs.
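
To see why a wildcard subject is dangerous, here is a small sketch that approximates glob-style subject matching with Python's fnmatch. Real cloud policy engines have their own matching rules, so treat this as an illustration of the failure mode only; subject_allowed and is_overbroad are hypothetical helper names.

```python
from fnmatch import fnmatchcase


def subject_allowed(pattern: str, subject: str) -> bool:
    """Glob-style match of a token subject against a trust pattern."""
    return fnmatchcase(subject, pattern)


def is_overbroad(pattern: str) -> bool:
    """Flag any trust pattern that contains a wildcard character."""
    return "*" in pattern or "?" in pattern


MAIN = "repo:myorg/myrepo:ref:refs/heads/main"
PULL_REQUEST = "repo:myorg/myrepo:pull_request"

# A wildcard pattern admits both the main branch and pull request runs.
WILDCARD_PATTERN = "repo:myorg/myrepo:*"
```

Here subject_allowed(WILDCARD_PATTERN, PULL_REQUEST) is true, while the exact pattern MAIN rejects the pull request subject. A lint step that fails on is_overbroad for production roles is one cheap way to keep the wildcard from creeping in.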

Secrets in pipelines - the hierarchy

OIDC (the previous section) handles authenticating the pipeline to the cloud. Most real pipelines also need other secrets: third-party SaaS API tokens, signing keys, database credentials for migration steps, webhook secrets, npm/PyPI publish tokens. The principle: work down the list below in order, and pick the first option that leaves no static secret in CI at all.

  1. OIDC / workload identity to the destination. If the secret is for a cloud service, federate directly - don't store it. Most cloud APIs and a growing number of SaaS providers (npm, PyPI, Docker Hub via trusted publishing) now accept OIDC tokens directly.
  2. Vault retrieval at job-start using OIDC. Authenticate to Vault / AWS Secrets Manager / Azure Key Vault / GCP Secret Manager using the pipeline's OIDC identity, fetch the secret, hold it in memory for the job, never persist. The secret rotates in the vault; the pipeline picks up the new value on the next run.
  3. CI-native secret store, scoped tight. When neither of the above is possible, use the CI provider's secret store with the narrowest scope it supports (environment-scoped, not repo-wide; required approval on the environment).
  4. Static value in plaintext anywhere. Never. Workflow files, comments, debug logs, Slack DMs to the on-call engineer - this is how every credential-leak postmortem begins.

GitHub Actions

GitLab CI

Azure DevOps

CircleCI, Buildkite, Jenkins, Argo

Self-hosted runner hygiene

Self-hosted runners are the most-abused secret-exposure surface in CI. The compromise pattern is consistent: a runner persists between jobs, a malicious PR runs cat ~/.aws/credentials or scans environment variables, exfiltrates whatever the previous job left behind, and the attacker reuses it indefinitely.

Secret-scanning gates

Pre-commit and CI scanners catch the slip-ups the workflow above should have prevented. They are the safety net, not the strategy - if they're catching things regularly, the upstream needs fixing.
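
For a feel of what such a gate does, here is a minimal sketch that scans the added lines of a unified diff for two well-known secret shapes. It is illustrative only: the two regexes cover a tiny fraction of what a real scanner checks, scan_diff is a hypothetical function, and the key in the usage note is the placeholder value AWS uses in its own documentation.

```python
import re
from typing import List, Tuple

# Two illustrative patterns; production scanners ship hundreds.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private-key-header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_diff(diff_text: str) -> List[Tuple[int, str]]:
    """Return (line number, pattern name) for each hit on an added line."""
    findings: List[Tuple[int, str]] = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        # Only lines the change adds; skip the "+++ b/file" header lines.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

A diff that adds the line export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE produces one aws-access-key-id finding; a removed line containing the same value is ignored, since the scan only looks at what the change introduces. Wiring a non-empty result to a failed check is what turns the scan into a gate.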

For the broader identity story underlying all of the above, see IAM - Secrets in developer workflows. For the underlying secrets-manager capabilities, see Data Security & KMS - Secrets management.

AWS, Azure, and GCP side-by-side

Each cloud ships a native CI/CD toolchain. Most teams use a hybrid - a third-party CI runner (GitHub Actions, GitLab) for build/test, and the cloud's native primitives for the deploy and runtime side. The capabilities map nearly one-to-one:

Building block | AWS | Azure | GCP
Native CI/CD service | CodePipeline + CodeBuild + CodeDeploy | Azure DevOps Pipelines | Cloud Build + Cloud Deploy
Source hosting | CodeCommit (deprecated for new) - usually GitHub | Azure Repos - usually GitHub | Cloud Source Repositories - usually GitHub
Artifact registry | ECR, CodeArtifact, S3 | Azure Container Registry, Artifacts | Artifact Registry
OIDC federation | IAM OIDC provider + role with trust on token.actions.githubusercontent.com | Workload identity federation in Entra ID app registration | Workload Identity Federation pool + provider
Container deploy target | ECS, EKS, App Runner, Lambda | AKS, Container Apps, App Service, Functions | GKE, Cloud Run, Cloud Functions
Progressive delivery | CodeDeploy (canary, linear, all-at-once) | Deployment strategies in Pipelines + Container Apps revisions | Cloud Deploy phased rollouts + Cloud Run traffic splits
IaC native | CloudFormation, CDK | ARM, Bicep | Deployment Manager (deprecated), Config Controller
IaC vendor-neutral | Terraform / OpenTofu, Pulumi | Terraform / OpenTofu, Pulumi | Terraform / OpenTofu, Pulumi
Secret store | Secrets Manager, SSM Parameter Store | Key Vault | Secret Manager
GitOps option | Argo CD / Flux on EKS, AWS Proton | Argo CD / Flux on AKS, GitOps for AKS (Flux extension) | Argo CD / Flux on GKE, Config Sync

Anyone who's built a CI/CD pipeline on one cloud has 80% of the conceptual model needed to build one on another. The remaining 20% is the IAM/Entra/Workload Identity specifics and the deploy primitives of each platform.

AWS - CodePipeline, OIDC, and the IAM model

AWS gives you two real choices: build your pipeline entirely inside AWS (CodePipeline / CodeBuild / CodeDeploy), or run the CI in GitHub Actions / GitLab and call out to AWS for the deploy. The second pattern is more common in practice - most teams already live in GitHub.

The OIDC setup

Create an IAM OIDC identity provider for token.actions.githubusercontent.com in your AWS account. Create an IAM role with a trust policy that requires the OIDC subject claim to match your repo and branch - e.g. repo:myorg/myrepo:ref:refs/heads/main. In your GitHub Actions workflow, use aws-actions/configure-aws-credentials with role-to-assume - no aws-access-key-id, no aws-secret-access-key.
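
A trust policy along those lines can be sketched as a plain JSON document, built here in Python. The account ID, repo, and branch in the usage note are placeholders, and github_trust_policy is a hypothetical helper; the condition keys and the sts.amazonaws.com audience follow the commonly documented GitHub-to-AWS setup, so check them against current AWS and GitHub documentation before relying on them.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_trust_policy(account_id: str, repo: str, branch: str) -> dict:
    """Build a role trust policy pinned to one repo and one branch."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": (
                    f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                )
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                # StringEquals, not StringLike: no wildcard subjects.
                "StringEquals": {
                    f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                    f"{GITHUB_OIDC_HOST}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                }
            },
        }],
    }


def render(policy: dict) -> str:
    """Serialize the policy for review or for passing to IaC tooling."""
    return json.dumps(policy, indent=2)
```

Calling render(github_trust_policy("123456789012", "myorg/myrepo", "main")) yields a document whose sub condition is exactly repo:myorg/myrepo:ref:refs/heads/main. Generating the policy from three inputs, rather than hand-editing JSON, makes it harder to ship a role with a forgotten or loosened subject condition.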

The key AWS-specific decisions

For the in-AWS-only path: CodePipeline orchestrates, CodeBuild runs the build, CodeDeploy handles progressive rollout. It's heavier to set up than GitHub Actions + OIDC but keeps the entire pipeline inside the AWS account boundary - useful for FedRAMP / regulated workloads.

Azure - Azure DevOps, GitHub Actions, Workload Identity

Azure customers split roughly between Azure DevOps Pipelines (Microsoft's hosted CI/CD, still widely used) and GitHub Actions (also Microsoft-owned, now the default for new projects). Both authenticate to Azure the same way: workload identity federation.

The OIDC setup

Create an app registration in Entra ID. Add a federated identity credential that trusts your CI's OIDC issuer - https://token.actions.githubusercontent.com for GitHub Actions, or the Azure DevOps issuer URL. Assign the app the minimum RBAC role on the target subscription or resource group. In the pipeline, use Azure/login with client-id, tenant-id, subscription-id - no client-secret.
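
The federated identity credential is a small JSON object attached to the app registration. A minimal sketch, assuming GitHub Actions as the issuer and a branch-scoped subject: the field names and the api://AzureADTokenExchange audience reflect the commonly documented shape, but confirm them against current Microsoft documentation. github_federated_credential and the credential name in the usage note are hypothetical.

```python
from typing import Dict, List, Union

Credential = Dict[str, Union[str, List[str]]]


def github_federated_credential(name: str, repo: str, branch: str) -> Credential:
    """Build a federated credential body pinned to one repo and branch."""
    return {
        "name": name,
        "issuer": "https://token.actions.githubusercontent.com",
        # Exact subject: only workflow runs on this branch of this repo match.
        "subject": f"repo:{repo}:ref:refs/heads/{branch}",
        "audiences": ["api://AzureADTokenExchange"],
    }
```

For example, github_federated_credential("prod-deploy", "myorg/myrepo", "main") produces a credential that only a run on the main branch of myorg/myrepo can satisfy. The RBAC role assignment is a separate step and is not modeled here.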

The key Azure-specific decisions

Workload identity federation on Azure is the same security shape as OIDC on AWS - short-lived tokens, no client secrets, scoped to a specific repo and branch. The configuration is in Entra rather than IAM, but the threat model is identical.

GCP - Cloud Build and Workload Identity Federation

Google ships Cloud Build (CI), Cloud Deploy (continuous delivery for GKE and Cloud Run), and Artifact Registry (artifact storage). They integrate cleanly, and Cloud Build runs as a Google-managed service account in your project - so for fully-in-GCP pipelines, no federation is needed.

For GitHub Actions deploying to GCP, the pattern is Workload Identity Federation.

The Workload Identity Federation setup

Create a workload identity pool in your GCP project. Add a provider for token.actions.githubusercontent.com with attribute mappings that pull the GitHub subject into a Google identity. Grant the IAM service account permission to be impersonated by the pool subject, scoped to your repo. In GitHub Actions, use google-github-actions/auth with workload_identity_provider and service_account - no JSON service-account keys.
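
The two pieces that carry the repo scoping are the provider's attribute mapping and the IAM member string that names the pool subject. A sketch of both, assuming a mapping that exposes the GitHub repository claim as attribute.repository: the project number and pool ID in the usage note are placeholders, and repo_principal_set and impersonation_binding are hypothetical helpers.

```python
# Maps claims from the GitHub token onto Google-side attributes.
ATTRIBUTE_MAPPING = {
    "google.subject": "assertion.sub",
    "attribute.repository": "assertion.repository",
}


def repo_principal_set(project_number: str, pool_id: str, repo: str) -> str:
    """Name every identity in the pool whose repository attribute is `repo`."""
    return (
        f"principalSet://iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_id}"
        f"/attribute.repository/{repo}"
    )


def impersonation_binding(project_number: str, pool_id: str, repo: str) -> dict:
    """IAM binding that lets that repo impersonate the service account."""
    return {
        "role": "roles/iam.workloadIdentityUser",
        "members": [repo_principal_set(project_number, pool_id, repo)],
    }
```

With a project number of 123456789, a pool called github-pool, and the repo myorg/myrepo, the member string ends in /attribute.repository/myorg/myrepo. Note that this scopes to the repo only; narrowing to a branch would need an additional mapped attribute or a provider-level condition, which this sketch leaves out.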

The key GCP-specific decisions

The Cloud Build + Cloud Deploy + Artifact Registry trio is the most cohesive native CI/CD stack of the three big clouds - for a workload that lives entirely on GCP and uses containerized deploys, it's the path of least resistance.

Deployment strategies - blue/green, canary, progressive

"Click deploy and hope" is a deployment strategy. It's just a bad one. The cloud's value here is that better strategies are cheap - you have elastic capacity, traffic-shifting load balancers, and managed control planes that make non-trivial rollouts safe by default.

All-at-once

Replace every instance simultaneously. Fast, simple, no extra capacity. Outage during the swap if anything fails. Defensible only for stateless workloads with short startup time, in non-prod, or behind a feature flag.

Rolling

Replace instances one batch at a time (10% → 25% → 50% → 100%). No extra capacity, no outage during deploy, but rollback is slow because you're rolling backward through the same process. The default for Kubernetes deployments.

Blue/green

Two complete environments. Deploy to the idle one (green), smoke-test it, flip 100% of traffic over. Rollback is instant - flip back to blue. Doubles infrastructure during the deploy window. Best fit for stateful workloads where partial rollouts are risky.

Canary

New version runs alongside old. Send 1% of traffic to it; watch error rates and latency. Ramp to 5%, 25%, 100% over minutes or hours. Auto-rollback on metric regression. Needs reliable signals - a flaky SLO means false rollbacks. The safest strategy when you have the observability to support it.
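
The canary control loop is simple enough to sketch. This version assumes a single signal (error rate) and a fixed threshold; run_canary and the error_rate_at callback are hypothetical, standing in for whatever traffic-shifting and metrics APIs the deploy target provides.

```python
from typing import Callable, Sequence, Tuple


def run_canary(steps: Sequence[int],
               error_rate_at: Callable[[int], float],
               max_error_rate: float) -> Tuple[str, int]:
    """Ramp traffic through `steps`; roll back on the first bad reading.

    `error_rate_at(percent)` is assumed to shift that share of traffic to
    the new version, wait out the observation window, and return the
    observed error rate as a fraction (0.01 means 1%).
    """
    for percent in steps:
        if error_rate_at(percent) > max_error_rate:
            return "rolled_back", percent
    return "promoted", steps[-1]
```

With steps of (1, 5, 25, 100) and a 1% threshold, a version that stays healthy is promoted at 100, while one whose error rate spikes at the 25% step is rolled back there, before most users ever see it. A real implementation would usually require several consecutive bad readings before rolling back, which is one way to blunt the flaky-signal problem mentioned above.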

Native support varies by deploy target:

Pick the simplest strategy that meets the workload's blast radius. A static site can do all-at-once. A payments service should not.

Securing the pipeline itself

The pipeline is now the soft underbelly of cloud security. Its credentials are the keys to production; its source artifacts are what runs in prod; its build environment is where supply-chain attacks land. Treat the pipeline like a privileged production system, because it is one.

Infrastructure-as-code in the pipeline

If the application code goes through the pipeline, the infrastructure code should too. The same review, scan, and deploy controls that protect your app should protect the network, IAM, and data stores it runs on.

The typical IaC pipeline shape:

  1. Format & validate - terraform fmt -check, terraform validate. Trivially cheap, catches typos before review.
  2. Policy scan - Checkov, tfsec, Trivy, Open Policy Agent / Conftest, or your CSPM/CNAPP's IaC scanner. Blocks merges that violate guardrails (public buckets, unencrypted disks, IAM wildcards).
  3. Plan on pull request - terraform plan output posted as a PR comment. Reviewers see exactly what will change.
  4. Apply on merge to main - pipeline assumes deploy role via OIDC, runs terraform apply, writes state to the remote backend. State backend access is restricted to the deploy role.
  5. Drift detection on a schedule - periodic plan-only runs alert when reality has drifted from code (someone clicked something in the console). Drift is the canary for broken IaC discipline.
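
The drift check in step 5 can lean on terraform plan's -detailed-exitcode flag, which exits 0 when there is nothing to change, 2 when the plan contains changes, and 1 on error. A minimal sketch, assuming the terraform binary is on the PATH and the working directory is already initialized; classify_plan_exit and check_drift are hypothetical helper names.

```python
import subprocess


def classify_plan_exit(code: int) -> str:
    """Map terraform plan -detailed-exitcode results to a drift status."""
    if code == 0:
        return "in-sync"  # the plan is empty: reality matches the code
    if code == 2:
        return "drift"    # the plan has changes: something moved
    return "error"        # exit code 1 or anything unexpected


def check_drift(workdir: str) -> str:
    """Run a plan-only check in `workdir` and classify the outcome."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)
```

A scheduled job would call check_drift on each workspace and alert on anything other than "in-sync". Treating "error" as alert-worthy too keeps a broken backend or expired trust from silently disabling the check.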

Terraform / OpenTofu is the dominant vendor-neutral choice. Pulumi is the same shape with familiar languages. Each cloud's native IaC (CloudFormation/CDK, Bicep, Config Controller) is a credible single-cloud alternative - pick what your team already operates.

For Kubernetes-targeted infrastructure, GitOps controllers (Argo CD, Flux) push this pattern further: the pipeline updates a Git repo with the desired state; the in-cluster controller pulls and reconciles. The cluster itself trusts only the Git repo and the controller's identity, not the CI runner.

Bootstrapping path

For a team standing up CI/CD from scratch into a cloud, a sane order of operations:

  1. Pick your CI runner first. GitHub Actions is the safe default - broad ecosystem, OIDC support on all three clouds, generous free tier. Azure DevOps, GitLab CI, CircleCI, Buildkite all work; the choice is mostly about where your team already lives.
  2. Set up OIDC federation before your first deploy. Configure the cloud-side identity provider and a deploy role with narrow permissions. Verify the trust by running a no-op assume-role from a workflow. Never ship the first version with long-lived keys "just to get it working" - those keys live forever afterward.
  3. Build a "hello world" pipeline that deploys one trivial resource. A single S3 bucket, a single resource group, a single Cloud Run service. The point is to validate the wiring end-to-end before you have anything to break.
  4. Add the security stages. Secret scanning on every PR. IaC scanning. Container scanning. Wire failures so they block the merge.
  5. Add tests. Unit tests on the CI side; smoke tests on the CD side. End-to-end tests against the deployed environment.
  6. Split prod from non-prod environments. Different deploy roles. Different trust policies. Different state backends. Production trust scoped to the main branch only; non-prod can be looser.
  7. Add a deploy strategy beyond all-at-once. Rolling or canary, depending on the workload. Wire an automated rollback signal - error rate threshold, healthcheck failure, alarm trip.
  8. Wire pipeline observability. Slack/Teams notifications on failure. Pipeline run audit log into your SIEM. Metrics on deploy frequency, lead time, change failure rate, MTTR (the DORA metrics).
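
Three of the four DORA metrics in step 8 fall out of a simple deploy log. A sketch under stated assumptions: the Deploy record and its fields are hypothetical, "failed" means the deploy caused a rollback or incident, and MTTR is left out because it needs incident timestamps this record does not carry.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Sequence


@dataclass
class Deploy:
    committed_at: datetime
    deployed_at: datetime
    failed: bool  # True if this deploy led to a rollback or incident


def lead_time_minutes(deploys: Sequence[Deploy]) -> float:
    """Median minutes from commit to production."""
    return median(
        (d.deployed_at - d.committed_at).total_seconds() / 60 for d in deploys
    )


def change_failure_rate(deploys: Sequence[Deploy]) -> float:
    """Fraction of deploys that failed."""
    return sum(d.failed for d in deploys) / len(deploys)


def deploys_per_day(deploys: Sequence[Deploy], window_days: int) -> float:
    """Deploy frequency over a reporting window."""
    return len(deploys) / window_days
```

The median is used for lead time so that one slow deploy does not dominate the number. All three functions expect a non-empty list of deploys; an empty reporting window should be handled by the caller.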

Realistic time investment for a small team: a couple of weeks to a minimally-credible production pipeline; a quarter or two to one that you'd be comfortable scaling to ten more workloads on.

Common pitfalls

Further reading

Cloud-native references

Supply chain & pipeline security

Practice & metrics

Related CSOH pages

FAQ

What is the difference between CI and CD?

Continuous Integration (CI) is the build/test side - merging code changes frequently and verifying each merge automatically. Continuous Delivery (CD) is the deploy side - every passing build is automatically packaged and shippable, with a human gating the final release. Continuous Deployment automates the final step too: every passing build goes to production. "CI/CD" as a phrase covers all of it.

Should I use the cloud's native CI/CD or a third-party tool?

For most teams, GitHub Actions (or your existing CI) for build/test and the cloud's native primitives for deploy targets is the right answer. The cloud-native CI services (CodePipeline, Azure DevOps, Cloud Build) are credible but rarely a reason to leave a tool your team already uses. The exception is regulated environments where keeping the entire pipeline inside the cloud account boundary simplifies compliance.

Do I need progressive delivery for a small project?

Not on day one. A static site or a single-team prototype can do all-at-once deploys safely - the blast radius is small and the rollback is fast. Reach for canary or blue/green when the cost of a bad deploy (real users seeing errors, real money lost) is higher than the cost of running the extra capacity.

How does GitOps fit in?

GitOps is a CD pattern, not a replacement for CI. CI still builds and tests artifacts; CI commits the desired state (image tag, manifest) into a Git repo; an in-cluster controller (Argo CD, Flux) pulls and reconciles. The win is that the target cluster never trusts the CI runner - only itself and the Git repo - so a compromised CI cannot directly push into the cluster.

What about secrets - do I still need a secret store with OIDC?

Yes. OIDC removes cloud credentials from CI, but workloads still need third-party API keys, database passwords, signing keys. Those live in Secrets Manager / Key Vault / Secret Manager and are fetched by the workload at runtime - not stored in the pipeline. CI only ever has the cloud role needed to reference the secret, never the secret itself.

How fast should my pipeline be?

The DORA elite benchmark is lead time under one hour from commit to production. For most cloud workloads, that's achievable: 5-10 minutes of CI, 5-10 minutes of deploy + verify, the rest in queue. If your pipeline is over an hour, the problem is usually serialized stages that could run in parallel, or a verification stage waiting on a flaky external dependency.

Can I do CI/CD without containers?

Yes. The pipeline pattern is independent of the artifact type. Lambda zips, VM images (Packer), unikernels, static site bundles, Terraform plans - all flow through the same eight stages. Containers happen to be the dominant artifact today because they unify build + runtime; the CI/CD shape is older than containers and survives them.

How does this relate to zero trust?

The pipeline operationalizes zero trust at the deploy layer. "Verify explicitly" is OIDC federation - every deploy is a fresh authentication with a fresh subject claim. "Least privilege" is per-environment deploy roles with permission boundaries. "Assume breach" is auditability - every change traces to a commit, a reviewer, and a run id, so a compromise has a finite, observable blast radius.
