Serverless Functions & Cloud Security

Code that runs only when something happens - an HTTP request, a queue message, a file landing in storage - and disappears when it's done. Vendor-neutral guide to AWS Lambda, Azure Functions, and Google Cloud Functions: what they are, where they shine, where they fail, and the security implications of event-driven, short-lived compute.

The 30-second version: A serverless function (a.k.a. FaaS) is code the cloud runs in response to an event - HTTP request, queue message, file upload, schedule - without you provisioning, patching, or managing a server. AWS Lambda, Azure Functions, and Google Cloud Functions are the dominant products. The platform owns the OS, the runtime, the scaling, and the host-level isolation; you own the code, the IAM role, the event configuration, and the dependencies.

Security shifts accordingly. Kernel-level attack surface largely disappears. What replaces it: event injection from untrusted sources, identity sprawl across hundreds of small functions, supply-chain risk in every dependency, and denial of wallet as the new flavor of DoS.

On this page

  1. What a serverless function actually is
  2. Why use serverless
  3. The good and the bad
  4. The trust boundary
  5. The seven risk categories that matter
  6. Event injection - untrusted payloads
  7. Identity sprawl & per-function roles
  8. Supply chain - every dep is runtime
  9. Secrets, env vars, and config
  10. Networking, egress, and VPCs
  11. Denial of wallet
  12. Logging & observability
  13. AWS, Azure, and GCP side-by-side
  14. Hardening checklist
  15. Common pitfalls
  16. Further reading
  17. FAQ

What a serverless function actually is

Strip the marketing off and a serverless function is three things:

  1. A short piece of code - usually one entry point that takes an event object and returns a result. JavaScript, Python, Go, Java, .NET, Ruby, custom runtimes via a thin shim. Code lives in a deploy package (zip, container image, or a managed runtime layer).
  2. An event configuration - declaring what causes the function to run. HTTP routes via an API gateway, queue messages, blob/object events, scheduled timers, database streams, message-bus events, identity events.
  3. An execution role - the cloud identity the function assumes when it runs. Permissions for whatever services the code needs to call (read a row from DynamoDB, write to S3, call SQS, etc.).
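To make the first item concrete, here is a minimal sketch of what such an entry point can look like in Python. The event shape (a JSON string under a `body` key) is loosely modeled on an HTTP gateway event and is an assumption for illustration; real payloads differ by provider and trigger type.

```python
import json


def handler(event, context):
    """Entry point: the platform passes the event payload and a runtime context."""
    # Assumed shape: HTTP-style triggers commonly deliver the body as a JSON string.
    body = json.loads(event.get("body") or "{}")
    name = body.get("name", "world")

    # Return a result the HTTP front end can translate into a response.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

The event configuration and the execution role live outside this file, in the deployment configuration, which is why the code alone says little about how exposed the function is.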

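When an event arrives, the platform routes it to an isolated execution environment (a Firecracker micro-VM on AWS, with equivalent sandboxes on Azure and GCP). If no warm environment is free, it creates one and loads your code into it, which is the cold start. The invocation then runs up to the provider's timeout (Lambda caps at 15 minutes). The environment may be kept warm and reused for later invocations of the same function, and the platform discards it after a period of idleness or at its own discretion. From your side: no servers visible, no nodes to patch, no operating system to maintain. The platform handles all of that.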

The serverless name is a marketing term - there are obviously servers underneath. The accurate framing is functions as a service: the unit of deployment is a function, the unit of billing is the millisecond of execution, and the platform owns everything between your code and the silicon.

Why use serverless

The model maps cleanly onto a specific shape of workload - event-driven, intermittent, with bursty concurrency:

The typical fit: API endpoints that don't need persistent state, ETL pipelines that fire when a file lands, scheduled jobs, fan-out and fan-in patterns, lightweight integrations between SaaS services, and any "respond to this cloud event" use case.

The good and the bad

Serverless is good for some shapes, bad for others. The honest tradeoff matrix:

Dimension | The good | The bad
Cost (low traffic) | Pennies. Truly free at small scale. | -
Cost (sustained high traffic) | - | Above a few thousand QPS continuously, a long-running container or VM is usually 3-10× cheaper.
Operational toil | No nodes, no patches, no OS to manage. | Debugging a misbehaving function in production is harder - no shell, no kubectl exec, only logs and traces.
Scaling | Zero to thousands in seconds, automatic. | Concurrency limits, cold starts, downstream-dependency throttling. The platform scales; the database the function calls may not.
Latency | Warm invocations are fast (single-digit ms overhead). | Cold starts add 100ms-several seconds. Bad for latency-sensitive paths unless provisioned concurrency is used (which costs more).
Long-running work | - | Hard caps: Lambda 15 min; Cloud Functions 60 min (v2, HTTP-triggered); Azure Functions 10 min on Consumption (5 min default), with Premium defaulting to 30 min and configurable beyond that. Work that exceeds these doesn't fit.
Stateful work | Stateful operators can run side-by-side (Step Functions, Durable Functions, Workflows). | Functions themselves are stateless; state goes to external storage on every invocation.
Local dev / testing | Frameworks like SAM, Serverless Framework, Azure Functions Core Tools. | Local emulators always diverge from prod in subtle ways - event format differences, IAM behavior, throttling.
Vendor lock | Code itself is portable. | Event wiring, IAM, observability, and Step-Functions-equivalent orchestration are not. Migrating between clouds is real work.
Security surface | No host kernel to share; no nodes to compromise. | Attack surface shifts to identity, events, dependencies, and cost. Same total surface - different shape.

A working heuristic: if your workload is event-driven, intermittent, mostly stateless, and finishes in under 15 minutes, serverless is usually the right default. If it's sustained throughput, long-running, or has a hard latency budget under 100ms p99, containers or VMs are usually a better fit.
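The heuristic above can be written down as a small check, which is sometimes handy in architecture reviews. This is a sketch only: the `Workload` fields and thresholds simply restate the rule of thumb in the text and are not a vendor-defined test.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical descriptor; field names are illustrative.
    event_driven: bool
    intermittent: bool
    mostly_stateless: bool
    max_runtime_minutes: float
    p99_latency_budget_ms: float


def serverless_is_reasonable_default(w: Workload) -> bool:
    """Encode the rule of thumb: short, intermittent, stateless, event-driven."""
    if w.max_runtime_minutes >= 15:      # would not finish in under 15 minutes
        return False
    if w.p99_latency_budget_ms < 100:    # hard latency budget under 100 ms p99
        return False
    return w.event_driven and w.intermittent and w.mostly_stateless
```

A thumbnail generator triggered by file uploads passes; a streaming service with a 50 ms p99 budget does not.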

The trust boundary

For each function invocation, the security-relevant boundary is straightforward:

The platform genuinely makes the boundary stronger than a long-lived container or VM in some ways - short-lived execution environments that are recycled often, hypervisor isolation by default, no host to patch. It also makes some kinds of compromise cheaper: if a function leaks its temporary IAM credentials, an attacker can use them from outside the function until they expire, and the function only needs to be compromised once for that to happen.

The seven risk categories that matter

Serverless security in the cloud collapses into seven categories. The names differ across vendors; the threats don't.

1. Event injection

Untrusted data lands in the function via an event payload - S3 object metadata, SQS message body, API Gateway request, EventBridge event. Same OWASP injection classes, new entry points.

2. Identity sprawl

Hundreds of functions, each with its own role, each accumulating "just in case" permissions. Hard to audit; high blast radius when one role is compromised.

3. Supply chain

Every npm / pip / Maven dependency is loaded into the runtime on every cold start. A poisoned package is in production within one deploy.

4. Secrets handling

Secrets in env vars are visible via the console, logs, error stacks, and any code that prints process.env. The "obviously easy" pattern is the leakiest one.

5. Network egress

Default-egress-to-the-internet is the standard. A compromised function calls home; nothing in the platform stops it unless you put it in a VPC with restricted egress.

6. Denial of wallet

Public function URLs, APIs without rate limits, recursive event loops. The attack target isn't uptime - it's your invoice. A weekend of fuzzing can produce a five-figure bill.

7. Observability gap

No shell, no kubectl exec, no host-level forensics. If you didn't log it, it didn't happen - and an attacker who knows that has every reason to keep their activity inside that gap.

Event injection - untrusted payloads

The most underestimated serverless risk. A function triggered by an event receives a structured payload - and that payload often contains attacker-controlled fields.

Some real entry points:

Defenses
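One defense this page's hardening checklist names is schema validation at the entry point. The sketch below shows the idea for an object-storage event, using the record layout of an S3 notification (`Records[].s3.bucket.name` and `Records[].s3.object.key`). The bucket allow-list and the key pattern are assumptions chosen for illustration; a real function would tailor both.

```python
import re
from urllib.parse import unquote_plus

ALLOWED_BUCKETS = {"uploads-example"}  # hypothetical bucket name
# Conservative allow-list pattern for object keys (an assumption, not a standard).
SAFE_KEY = re.compile(r"[A-Za-z0-9][A-Za-z0-9._/-]{0,255}")


class InvalidEvent(ValueError):
    """Raised when an event does not match the expected shape."""


def extract_objects(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs, rejecting anything outside the expected schema."""
    records = event.get("Records") if isinstance(event, dict) else None
    if not isinstance(records, list) or not records:
        raise InvalidEvent("missing Records")

    objects = []
    for record in records:
        try:
            bucket = record["s3"]["bucket"]["name"]
            raw_key = record["s3"]["object"]["key"]
        except (KeyError, TypeError) as exc:
            raise InvalidEvent("malformed record") from exc
        if not isinstance(bucket, str) or not isinstance(raw_key, str):
            raise InvalidEvent("bucket and key must be strings")

        # Keys arrive URL-encoded in S3 notifications; decode before validating.
        key = unquote_plus(raw_key)
        if bucket not in ALLOWED_BUCKETS:
            raise InvalidEvent("unexpected bucket")
        if ".." in key.split("/") or not SAFE_KEY.fullmatch(key):
            raise InvalidEvent("unsafe object key")
        objects.append((bucket, key))
    return objects
```

Validating after decoding matters: a key such as `x.csv%3B+rm+-rf` looks harmless encoded and becomes `x.csv; rm -rf` once decoded, which is exactly the value that would reach a shell or a path join.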

Identity sprawl & per-function roles

The right pattern: one IAM role per function, each scoped to the minimum permissions that function needs. The frequent reality: a small number of broad "this role works for most of our Lambdas" roles, reused across dozens of functions, all over-permissioned.

Why this drifts:

The blast-radius math: if 50 functions share one role with s3:* on *, then any of those 50 functions being compromised gives the attacker access to every S3 bucket in the account. With 50 distinct roles each scoped to one bucket, the blast radius is one function's worth of data.

Defenses
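As an illustration of the blast-radius point, the sketch below flags wildcard grants in an IAM-style policy document represented as a Python dict. It is a deliberately small linter, not a replacement for the provider's own policy analysis tools, and the bucket ARN is a made-up example.

```python
def wildcard_findings(policy: dict) -> list[str]:
    """List Allow statements that grant wildcard actions or a bare '*' resource."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):      # a single statement may appear unwrapped
        statements = [statements]

    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources

        for action in actions:
            if action == "*" or action.endswith(":*"):
                findings.append(f"statement {index}: wildcard action {action!r}")
        for resource in resources:
            if resource == "*":
                findings.append(f"statement {index}: wildcard resource '*'")
    return findings


# The shared role described above: s3:* on *.
SHARED_ROLE = {"Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}]}

# A per-function role scoped to one (hypothetical) bucket.
SCOPED_ROLE = {
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::invoices-example/*",
        }
    ]
}
```

Running `wildcard_findings(SHARED_ROLE)` reports both the action and the resource wildcard, while `SCOPED_ROLE` comes back clean. A check like this could run in CI so that broad roles are caught before they are reused across functions.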

Supply chain - every dep is runtime

In a long-lived container, dependencies are pulled at build time and frozen. In a serverless function, dependencies are still pulled at build time - but the cold-start surface means every dependency is loaded on every fresh execution environment. The attack window is identical, but the velocity of "I just upgraded a dependency and it shipped to prod" is higher because deploy cycles are shorter.

Real serverless supply-chain incidents:

Defenses
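The hardening checklist on this page calls for hash-verified dependencies. Package managers provide this natively (pip, for example, has a `--require-hashes` mode), so the sketch below is only meant to show the underlying check: compare an artifact's SHA-256 digest with a value pinned at review time and refuse to proceed on mismatch. The function names are illustrative.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the artifact bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, pinned_sha256: str) -> None:
    """Raise if the artifact does not match the digest pinned in the lockfile."""
    actual = sha256_hex(data)
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(actual, pinned_sha256.lower()):
        raise ValueError(f"digest mismatch: expected {pinned_sha256}, got {actual}")
```

The pinned digest has to come from a source the attacker cannot rewrite alongside the package, typically a lockfile committed to version control and changed only through review.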

See the CI/CD page for the pipeline side of this same problem.

Secrets, env vars, and config

Every serverless platform offers "environment variables" as the easy way to inject configuration. They're convenient, they're standard, and they're the wrong place for secrets.

Why env vars are leaky:

Defenses
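A common alternative to env vars is to fetch secrets from a secrets store at runtime and cache them briefly in memory. The sketch below shows that pattern in a provider-neutral way: `fetch` is a caller-supplied function (hypothetical here) that would wrap the real secrets-store client, and the five-minute TTL is an arbitrary assumption.

```python
import time
from typing import Callable


class SecretCache:
    """Fetch secrets on demand and keep them in memory for a short TTL."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch            # wraps the actual secrets-store call
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so tests can control time
        self._cache: dict[str, tuple[float, str]] = {}

    def get(self, name: str) -> str:
        now = self._clock()
        hit = self._cache.get(name)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]              # still fresh: no network call
        value = self._fetch(name)
        self._cache[name] = (now, value)
        return value
```

Creating the cache at module scope lets warm invocations reuse it, and the TTL bounds how long a rotated secret stays stale. The secret value never appears in the function's configuration, so it does not show up in console views of env vars.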

Networking, egress, and VPCs

By default, a serverless function runs outside your VPC. It has internet egress (which means it can call any external API) and can be invoked over the internet through an API Gateway or Function URL. The "outside the VPC" default is a security choice with two faces:

When you should attach the function to a VPC

When the VPC isn't worth it

Whether or not you VPC-attach, egress to known-bad destinations should be blocked at some layer - a NAT gateway allow-list, a managed firewall (AWS Network Firewall, Azure Firewall, Cloud NGFW), or runtime egress policy if your CNAPP supports it.
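The controls above sit at the network layer. As a complementary, in-code illustration of the same allow-list idea, the sketch below checks an outbound URL against a fixed set of hosts before the function calls it. This is a different and weaker technique than a firewall, since code running inside a compromised function can skip the check; the host names are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical destinations this function is expected to call.
ALLOWED_HOSTS = frozenset({"api.example.com", "hooks.example.net"})


def is_allowed_destination(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests to hosts on the explicit allow-list."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False                   # unparseable URLs are rejected outright
    if parts.scheme != "https":
        return False
    # hostname strips userinfo and port, so "allowed@evil" tricks do not pass.
    host = (parts.hostname or "").lower().rstrip(".")
    return host in allowed_hosts
```

Matching on the parsed hostname rather than a string prefix is the important detail: `https://api.example.com.evil.test/` and `https://api.example.com@evil.test/` both fail the check.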

Denial of wallet

The serverless equivalent of DDoS. Instead of taking your application offline, the attacker drives up your invocation count and your cloud bill becomes the casualty. Recent incidents have reported five-figure damages from a single weekend.
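A rough way to reason about the exposure is to estimate what a sustained request flood would cost. The sketch below uses a per-request charge plus a per-GB-second compute charge; the two rates are assumptions for illustration (check the provider's current price list), and the estimate ignores free tiers, API gateway charges, data transfer, and downstream services.

```python
# Assumed illustrative rates in USD; real prices vary by provider, region, and CPU type.
REQUEST_PRICE_PER_MILLION = 0.20
GB_SECOND_PRICE = 0.0000166667


def estimated_cost_usd(
    requests_per_second: float,
    hours: float,
    avg_duration_ms: float,
    memory_mb: int,
) -> float:
    """Estimate invocation plus compute cost for a sustained request rate."""
    invocations = requests_per_second * hours * 3600
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    request_cost = invocations / 1_000_000 * REQUEST_PRICE_PER_MILLION
    compute_cost = gb_seconds * GB_SECOND_PRICE
    return request_cost + compute_cost


# A weekend (48 h) at 2,000 requests/s, 500 ms per call, 1 GB of memory.
weekend = estimated_cost_usd(2000, 48, 500, 1024)
```

Under these assumed rates the weekend comes to roughly 2,950 USD for the function alone, before any of the excluded charges, and it scales linearly with request rate, duration, and memory. That linearity is why concurrency caps and throttles matter: they put a ceiling on the multiplication.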

Common vectors:

Defenses
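For the recursive-event-loop case mentioned earlier on this page, one simple guard is to carry a hop counter in the events a function emits and refuse to continue past a small limit. The sketch below assumes a `hop_count` field that you add to your own event payloads; the field name and the limit of 3 are illustrative, not a platform feature.

```python
MAX_HOPS = 3  # illustrative limit


class LoopDetected(RuntimeError):
    """Raised when an event has been re-emitted more times than allowed."""


def next_hop(event: dict) -> dict:
    """Return a copy of the event with hop_count incremented, or refuse."""
    hops = event.get("hop_count", 0)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(hops, int) or isinstance(hops, bool) or hops < 0:
        raise ValueError("hop_count must be a non-negative integer")
    if hops >= MAX_HOPS:
        raise LoopDetected(f"event exceeded {MAX_HOPS} hops; dropping")
    return {**event, "hop_count": hops + 1}
```

A function that writes back to the source that triggers it would call `next_hop` before emitting, so a misconfigured loop stops after a few iterations instead of running until the budget alarm fires. The counter is only trustworthy if outsiders cannot write to the event source, so it complements, and does not replace, concurrency limits.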

Logging & observability

The serverless equivalent of "did anything weird run last month?" is "do we have the logs to answer that?" - and the default answer is often no.

What's missing relative to a VM or container:

What good looks like
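One ingredient that usually belongs here is structured logging that ties every line to a specific invocation. The sketch below emits one JSON object per log line with a request ID and function name attached. The field names are assumptions; in practice the request ID would come from the runtime context object (on Lambda, for instance, `context.aws_request_id`).

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON object for centralized search."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": record.created,                       # epoch seconds
            "level": record.levelname,
            "message": record.getMessage(),
            # Populated through the `extra=` argument at the call site.
            "request_id": getattr(record, "request_id", None),
            "function": getattr(record, "function_name", None),
        }
        return json.dumps(entry)


def build_logger(stream=sys.stdout) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("fn")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()            # avoid duplicate handlers on warm reuse
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger
```

A call such as `build_logger().info("object processed", extra={"request_id": "req-123", "function_name": "resize-image"})` produces a line that a log pipeline can filter by invocation, which is what makes after-the-fact investigation possible when there is no host to inspect.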

AWS, Azure, and GCP side-by-side

Three major FaaS platforms, broadly similar in shape, different in details.

Building block | AWS Lambda | Azure Functions | Cloud Functions
Max execution time | 15 min | 5 min default, 10 min max (Consumption) / unlimited (Premium & Dedicated) | 60 min (v2 / Cloud Run-based)
Execution isolation | Firecracker micro-VM | Sandboxed worker (Consumption); container (Premium) | gVisor on Cloud Run infrastructure
Identity model | IAM execution role per function | Managed identity per function app | Per-function service account
Public invocation | Function URL, API Gateway, ALB | HTTP trigger, API Management | HTTPS trigger, API Gateway
Secrets integration | Secrets Manager / SSM via env var or extension | Key Vault reference syntax in app settings | Secret Manager via env var or runtime fetch
VPC attach | Optional; Hyperplane ENIs minimize cold-start cost | Premium & Dedicated only | VPC connector / direct VPC egress
Concurrency control | Reserved & provisioned concurrency | Function-app scale-out limits | Max instances per function
Signing | AWS Signer for code packages | Container image signing on Premium | Binary Authorization (container deploys)
Native observability | CloudWatch Logs, X-Ray, Lambda Insights | Application Insights, Azure Monitor | Cloud Logging, Cloud Trace
Runtime threat detection | GuardDuty Lambda Protection | Defender for Cloud / App Service | Security Command Center (Cloud Run-shared signals)

The three converge over time as each picks up the others' best ideas. The differences worth caring about: max execution time (15 min cap on Lambda is sometimes a deal-breaker), VPC attach behavior (Lambda's Hyperplane is the cleanest of the three), and how secrets are referenced (Azure's Key Vault reference syntax is the most ergonomic).

Hardening checklist

The non-negotiable serverless hardening list:

Identity

One execution role per function. Resource-level permissions where possible. Permission boundary on every role. No wildcards in production.

Input

Schema-validate every event at the entry point. Treat user-controlled and "system" fields alike as untrusted. Constrain who can push to event sources.

Supply chain

Lockfile-pinned dependencies, hash-verified at install. CI scanning that fails on HIGH/CRITICAL CVEs. Signed deployments verified at the deploy boundary.

Operational guardrails

Reserved concurrency. API Gateway throttling. Budget alarms with auto-pause. Centralized logging + traces. Auth on every public function.

Layer on top: a secrets store for anything sensitive (never env vars), egress controls if the function is internet-callable, and runtime / CNAPP telemetry for the things logs can't show you.

Common pitfalls

Further reading

Vendor docs & best practices

Standards & frameworks

Frameworks & tooling

Related CSOH pages

FAQ

Is "serverless" the same as "FaaS"?

Functions-as-a-service is the specific form this page covers - Lambda, Azure Functions, Cloud Functions. Serverless is the marketing umbrella that also includes serverless containers (Cloud Run, Fargate, Container Apps), serverless databases (DynamoDB on-demand, Aurora Serverless, Cloud Spanner), and a growing list of "no infrastructure to manage" services. This page is FaaS-specific; the Containers page covers the serverless-container variants.

Should I write everything as serverless functions?

No. Functions are great for event-driven, intermittent, mostly stateless workloads that finish in under 15 minutes. They get expensive at high sustained throughput, awkward for long-running work, and lock you into one cloud's event-source wiring. The "rebuild everything as Lambdas" pattern from 2018 burned a lot of teams - most have since migrated steady-state services back to containers and kept serverless for the parts where it actually fits.

How is cold-start security-relevant?

Two ways. First, every cold start re-loads all dependencies - meaning a compromised dependency activates on every cold start (not just at deploy). Second, init-time code paths (loading secrets, opening connections) run differently from steady-state warm code paths - bugs that only happen during init can be harder to catch.
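The split between init-time and per-invocation code is easy to see in a handler module. In the sketch below, everything at module scope runs once per execution environment (on the cold start), while the handler body runs on every invocation; the counter is purely illustrative.

```python
import time

# Module scope: executed once per execution environment, i.e. during a cold start.
# Imports, secret loading, and connection setup typically live here.
INIT_STARTED = time.monotonic()
_invocations = 0


def handler(event, context):
    """Runs on every invocation; module-level state persists while warm."""
    global _invocations
    _invocations += 1
    return {
        "cold_start": _invocations == 1,
        "invocation_in_this_environment": _invocations,
        "seconds_since_init": time.monotonic() - INIT_STARTED,
    }
```

Anything a dependency does at import time executes in the module-scope phase, with the function's full permissions, before the handler has validated a single event. That is why a compromised package does not need to be called explicitly to do damage.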

Can a function be exploited like a regular web app?

Yes. Injection, deserialization, broken access control, SSRF - every OWASP-class issue applies. The execution environment is shorter-lived and the IAM blast radius is usually narrower, but the application-layer surface is the same. Add event-source-specific injection as an extra category that doesn't exist for traditional servers.

How does this relate to zero trust?

Serverless makes some zero-trust principles trivially easy and others harder. Per-function isolation and per-function identity are zero trust by default - "verify explicitly" and "least privilege" come almost for free. What's harder: network-level segmentation (the platform abstracts the network away) and continuous verification (no host to inspect, no agent to run, so you depend on the platform's telemetry and your own logging).
