Container Security Best Practices: A Defender's Guide
Container security best practices are the controls a team applies across build, ship, run, and the orchestrator to protect containerized workloads at every stage they can be attacked.
Exec into a random pod in most clusters and the same picture comes back. The process runs as UID 0. The root filesystem is writable. The base image is a full ubuntu:latest with a package manager, a shell, and three hundred libraries the application never calls. A service-account token is mounted at /var/run/secrets whether the workload needs the API server or not, and nothing is watching what the container does after it starts. None of that took an exploit. Each is a default left in place, and each is the thing an attacker uses after the first foothold.
Container security best practices are the controls that remove those defaults, applied at every stage a container passes through: the image you build, the registry you ship it from, the host and runtime where it executes, and the orchestrator that schedules it. This guide is the best-practices companion to the concept article. It assumes you already know what container security is and why the shared kernel makes it its own discipline; if not, start with container security for the build-ship-run model and NIST SP 800-190. This piece is the field manual: the specific, durable practices a defender applies, why each one cuts risk, and how to do it.
How to think about container security best practices
A container is not one thing to secure. It is an artifact that moves through a pipeline, and each stage has its own attack surface, so the practices group by stage: build, ship, run, plus the orchestrator that spans all three. The reason to keep the stages separate is that a control at one stage cannot cover a gap at another. A hardened runtime does nothing about a critical CVE baked into the base layer at build time. A scanned, signed image is no help if the running container holds every Linux capability and a path to the host. The practices below close each gap where it opens.
Two properties make this urgent. First, the container shares the host kernel, so a contained workload that breaks out lands on the node and every other container on it, with no hypervisor in the way. Second, containers are dense and ephemeral: one bad base image gets deployed a thousand times, and a workload that lived for ninety seconds is gone before anyone reads its logs. Best practices answer both. Shrink what each container can do, and capture what it did while it ran.
The practices are ordered roughly by where risk enters the pipeline, build first. Earlier is cheaper: a flaw caught at build is a failed pipeline; the same flaw caught at runtime is an incident.
Build minimal images from trusted bases
Most container risk is introduced at build time, because everything that runs in production was put in the image by someone. The image is the first and highest-leverage place to cut it.
Why it matters. An image is a stack of layers, and the bottom layer is a base you did not write. Pull a full-OS base and you inherit every package in it, most of which your application never uses, each one a potential CVE and each one more for an attacker to live off after a compromise. A shell and a package manager in the image are tools you handed the intruder.
How to do it.
- Start from a minimal, trusted base. Distroless, Alpine, or a `scratch` image with only the binary and its runtime. Pull from a verified, pinned source, never an arbitrary public `latest` tag. Less in the image means less to exploit and less to patch.
- Pin by digest, not by tag. A tag like `latest` or `1.2` is mutable and can be repointed under you. Pin the base and every dependency to an immutable `sha256` digest so the build is reproducible and cannot be swapped.
- Add nothing you do not need at runtime. Use multi-stage builds so compilers, test tools, and build secrets stay in the build stage and never reach the final image.
- Never bake secrets into a layer. A token or key added in one layer and deleted in the next is still in the image history. Inject secrets at runtime, and scan images to catch the ones that slip in.
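The digest-pinning rule lends itself to a pipeline check. The following is a minimal sketch, not a full Dockerfile parser: it assumes base images are declared on single-line `FROM` instructions, and the function names `is_digest_pinned` and `unpinned_base_images` are hypothetical helpers introduced here for illustration.

```python
import re

# A digest-pinned reference ends in "@sha256:" followed by 64 lowercase hex characters.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref: str) -> bool:
    """Return True when the reference names an immutable sha256 digest."""
    return bool(DIGEST_RE.search(image_ref))


def unpinned_base_images(dockerfile_text: str) -> list:
    """List FROM references in a Dockerfile that are not pinned by digest."""
    stage_names = set()
    unpinned = []
    for line in dockerfile_text.splitlines():
        parts = line.strip().split()
        if len(parts) < 2 or parts[0].upper() != "FROM":
            continue
        # Skip flags such as --platform=... that may precede the image.
        args = [p for p in parts[1:] if not p.startswith("--")]
        if not args:
            continue
        image = args[0]
        # "scratch" is the empty base; a known stage name refers to an earlier build stage.
        if image != "scratch" and image not in stage_names and not is_digest_pinned(image):
            unpinned.append(image)
        # Remember "FROM image AS name" so later stages can build on it.
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_names.add(args[2])
    return unpinned
```

A CI step could fail the build whenever `unpinned_base_images` returns a non-empty list, which makes "pin by digest" something the pipeline enforces and not something reviewers have to remember.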
Scan for vulnerabilities and shift left
A minimal base reduces the surface; scanning tells you what is still on it. The point is to find known-vulnerable packages before they run, not after.
Why it matters. Public CVEs are the cheapest exploit an attacker has, because the work is already published. An image with a critical CVE in a library is a known door, and it ships to every replica the moment you deploy. Catching it in the pipeline costs a rebuild; catching it in production costs an incident and a fleet-wide redeploy.
How to do it.
- Scan in the pipeline and fail the build on critical and high findings, so a vulnerable image never reaches the registry. This is the shift-left move: the check runs where the fix is cheapest. Container image scanning is a vulnerability management practice applied to the artifact, not a one-time gate.
- Scan the registry continuously, too. A CVE disclosed tomorrow makes an image you scanned clean today vulnerable. Rescan stored images on new disclosures so yesterday's pass does not become today's blind spot.
- Prioritize by reachability, not raw CVSS. A critical CVE in a package the workload never loads is lower risk than a medium one on the running path. Use a scanner that accounts for whether the vulnerable code is actually exercised.
- Generate an SBOM. A software bill of materials lists every component in the image, so when the next widely-used library has a critical flaw you can answer "are we affected?" in minutes instead of days.
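The gating, prioritization, and SBOM lookup described above can be sketched in a few functions. This is an illustration under assumptions: the `Finding` shape, the `reachable` flag, and the SBOM represented as a mapping from image name to a set of package names are all hypothetical, and a real scanner or SBOM format will expose this information differently.

```python
from dataclasses import dataclass

BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}


@dataclass(frozen=True)
class Finding:
    cve_id: str        # placeholder identifier, not a real CVE
    package: str
    severity: str      # "CRITICAL", "HIGH", "MEDIUM", or "LOW"
    reachable: bool    # assumed flag: does the workload actually load the vulnerable code?


def blocking_findings(findings):
    """Findings that should fail the build: critical or high severity."""
    return [f for f in findings if f.severity.upper() in BLOCKING_SEVERITIES]


def triage_order(findings):
    """Sort reachable findings first, then by severity within each group."""
    rank = {"CRITICAL": 0, "HIGH": 1, "MEDIUM": 2, "LOW": 3}
    return sorted(findings, key=lambda f: (not f.reachable, rank.get(f.severity.upper(), 4)))


def affected_images(sboms, package):
    """Answer "are we affected?": images whose component set contains the package."""
    return sorted(image for image, components in sboms.items() if package in components)
```

With this ordering, a reachable medium finding sorts ahead of an unreachable critical one, which mirrors the reachability-over-raw-CVSS rule. The build gate stays stricter than the triage order on purpose: it blocks on every critical or high finding regardless of reachability.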
Sign images and secure the supply chain
Scanning shows that an image carries no known vulnerabilities. Signing proves it is the one you built and that nobody swapped it. Both are needed, because the registry and the path to it are themselves attack surface.
Why it matters. If an attacker can push to your registry or get a node to pull a look-alike, your runtime controls protect a malicious workload perfectly. A poisoned base image or a compromised CI step is a software supply chain attack: the artifact is trusted by construction, so it sails past defenses aimed at external threats.
How to do it.
- Sign images at build and verify at deploy. Use a signing tool such as Sigstore cosign to attach a cryptographic signature, and configure the cluster to admit only signed images from your registries. An unsigned or unverifiable image does not run.
- Lock the registry down. Private registries, least-privilege pull and push credentials, and no anonymous access. The registry holds every artifact you run; treat it like the sensitive system it is.
- Harden the pipeline that produces the image. Scope CI credentials tightly, isolate build runners, and pin the actions and base images the pipeline itself uses. A build system an attacker can edit is a build system that signs malware for you.
- Use admission control as the enforcement point. A policy controller (such as a Kubernetes admission webhook) is where "only signed, scanned, non-root images deploy" stops being a guideline and becomes a rule the cluster enforces.
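The admission decision for an image reference can be sketched as a small policy function. This is a simplified illustration, not how any particular policy controller is implemented: `admit_image` is a hypothetical name, the registry hostname in the usage note is made up, and `verified_digests` stands in for the output of real signature verification, which this sketch does not perform.

```python
import re

# Require the form "<repository>@sha256:<64 hex characters>".
PINNED_RE = re.compile(r"^(?P<repo>[^@\s]+)@(?P<digest>sha256:[0-9a-f]{64})$")


def admit_image(image_ref, allowed_registries, verified_digests):
    """Return (allowed, reason) for one image reference.

    verified_digests is assumed to hold digests whose signatures were
    already checked elsewhere; no cryptography happens in this function.
    """
    match = PINNED_RE.match(image_ref)
    if match is None:
        return False, "image is not pinned by sha256 digest"
    repo = match.group("repo")
    # Compare against "<registry>/" so a look-alike hostname prefix does not pass.
    if not any(repo.startswith(prefix.rstrip("/") + "/") for prefix in allowed_registries):
        return False, "image is not from an allowed registry"
    if match.group("digest") not in verified_digests:
        return False, "no verified signature for this digest"
    return True, "admitted"
```

For example, `admit_image("registry.example.internal/team/app:latest", ["registry.example.internal"], set())` is rejected at the first check because a tag is not a digest. The order of checks means the cheapest rejection reason is reported first.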
Run least privilege at runtime
Everything above shrinks what is in the image. Runtime configuration shrinks what the running container can do, which is what decides whether one compromised process stays contained or becomes a node takeover.
Why it matters. The kernel boundary is the only thing between a container and the host. A container running as root, with extra Linux capabilities, a writable filesystem, or access to the host namespaces, hands an attacker the tools to attack that boundary. A --privileged container is barely a boundary at all.
How to do it.
- Run as a non-root user with a read-only root filesystem. Set `runAsNonRoot` and a specific UID, and mount writable space only where the app genuinely needs it. Most workloads never need to write to their own image.
- Drop all capabilities, add back only what is required. Start from `drop: ALL` and grant the specific capability a workload needs, if any. Never run `--privileged`, and never mount the Docker socket into a container: that is a direct path to the host.
- Prevent privilege escalation. Set `allowPrivilegeEscalation: false` and apply seccomp, AppArmor, or SELinux profiles to constrain the syscalls and resources a container can reach.
- Set resource limits. CPU and memory limits on every workload keep one compromised or runaway container from starving its neighbors and turning a foothold into a denial of service across the node.
Detect and respond at runtime
Hardening reduces the odds; it does not make breakout impossible, and a determined attacker will probe. Runtime detection is how you see the attempt and the workload that succeeds, in an environment too ephemeral for after-the-fact forensics alone.
Why it matters. A container that lives for two minutes leaves no host to image once it is gone. If you are not watching process, file, and network behavior while it runs, the evidence disappears with the workload, and the first sign of compromise becomes the ransom note.
How to do it.
- Watch runtime behavior. A runtime sensor (commercial CWPP agents or open-source tooling like Falco) flags the anomalies that signal compromise: a shell spawned in a container that should never exec one, an unexpected outbound connection, a write to a path that should be read-only, a process escaping its expected profile.
- Stream container and orchestrator logs off the node to a central SIEM, because the node and the workload are both ephemeral. Logs left on a node that autoscaling reclaims are logs you do not have during the investigation.
- Build container-aware response playbooks. Cordon the node, isolate the pod with a deny-all network policy, and capture the container filesystem and memory before the scheduler kills and replaces it. The on-host playbook does not fit a workload that is already gone.
- Enforce immutability at runtime. A running container should not change. Tools that block or alert on in-container package installs and binary drops turn "the attacker modified the workload" into a detection rather than a surprise.
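The anomalies a runtime sensor looks for can be illustrated with a toy rule evaluator. This is a sketch of the idea only: the event dictionaries, the per-workload `profile`, and the `detect` function are hypothetical shapes invented for this example and do not reflect the rule format of Falco or any commercial agent.

```python
SHELLS = {"sh", "bash", "dash", "ash", "zsh"}


def detect(event, profile):
    """Return alert strings for one runtime event checked against a workload profile."""
    alerts = []
    kind = event["type"]
    if kind == "exec":
        # Compare only the binary name, so /bin/sh and /usr/bin/sh both match.
        binary = event["path"].rsplit("/", 1)[-1]
        if binary in SHELLS and not profile.get("allow_shell", False):
            alerts.append(f"shell spawned: {event['path']}")
    elif kind == "connect":
        if event["dest"] not in profile.get("allowed_destinations", set()):
            alerts.append(f"unexpected outbound connection: {event['dest']}")
    elif kind == "write":
        writable = profile.get("writable_paths", ())
        inside = any(
            event["path"] == p or event["path"].startswith(p.rstrip("/") + "/")
            for p in writable
        )
        if not inside:
            alerts.append(f"write outside writable paths: {event['path']}")
    return alerts
```

The profile is an allowlist of what the workload is expected to do, so anything outside it raises an alert. That matches the immutability point: a write to `/usr/bin` in a container that should only touch `/tmp` becomes a detection.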
Harden the orchestrator
In production the orchestrator is usually Kubernetes, and it is its own attack surface on top of the containers it runs. A perfectly hardened pod in a wide-open cluster is still exposed through the control plane.
Why it matters. The Kubernetes API server is the control plane for everything: an over-permissive role, an exposed dashboard, or an unauthenticated kubelet hands an attacker the whole cluster, not one workload. Pods talk to every other pod by default, so one compromised container can reach the entire mesh with no segmentation in the way.
How to do it.
- Lock down API access with RBAC. Grant least-privilege roles, never bind `cluster-admin` to workloads or default service accounts, and disable the auto-mount of service-account tokens for pods that do not call the API.
- Segment with network policies. Default to deny pod-to-pod traffic and allow only the specific flows each service needs, so a compromised pod cannot move laterally across the cluster freely.
- Enforce pod security at admission. Apply Pod Security Standards (or a policy engine like OPA Gatekeeper or Kyverno) to reject privileged pods, host-namespace sharing, and root containers before they schedule. This is where the runtime least-privilege rules above become non-optional.
- Keep the platform patched and benchmarked. Run the cluster against the CIS Kubernetes Benchmark, patch the control plane and nodes on a schedule, and protect etcd, the datastore that holds every secret, with encryption at rest and tight access.
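Two of the orchestrator controls above are easy to express as data. The sketch below builds a default-deny NetworkPolicy manifest as a Python dict and scans role bindings for service accounts bound to `cluster-admin`. It assumes bindings have already been fetched and parsed into dicts; the function names and the policy name `default-deny-all` are illustrative choices, not fixed conventions.

```python
def default_deny_policy(namespace):
    """Build a NetworkPolicy that selects every pod in the namespace and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},                     # empty selector matches all pods
            "policyTypes": ["Ingress", "Egress"],  # no rules listed, so nothing is allowed
        },
    }


def cluster_admin_service_accounts(bindings):
    """Return "namespace/name" for each ServiceAccount bound to cluster-admin."""
    found = []
    for binding in bindings:
        if binding.get("roleRef", {}).get("name") != "cluster-admin":
            continue
        for subject in binding.get("subjects") or []:
            if subject.get("kind") == "ServiceAccount":
                found.append(f"{subject.get('namespace', '')}/{subject['name']}")
    return found
```

Serializing the policy dict with `json.dumps` yields a manifest that can be applied like any other, after which each service needs an explicit allow policy for the flows it depends on. Any entry returned by `cluster_admin_service_accounts` is a workload identity holding full control of the cluster and deserves review.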
Container security best practices at a glance
The practices above, condensed by stage. Each is a control you apply continuously, not a task you finish once.
| Stage | Best practice | Why it matters | How to do it |
|---|---|---|---|
| Build | Minimal trusted images | Inherited packages are inherited CVEs | Distroless/Alpine, pin by digest, multi-stage, no baked secrets |
| Build | Scan and shift left | CVEs ship to every replica at deploy | Fail the build on critical/high, rescan the registry, SBOM |
| Ship | Sign and secure the supply chain | A swapped or poisoned artifact bypasses runtime defenses | Sign and verify, lock the registry, harden CI, admission control |
| Run | Least privilege at runtime | The kernel boundary is all that separates pod and host | Non-root, read-only FS, drop capabilities, no privileged, seccomp |
| Run | Detect and respond | Ephemeral workloads erase evidence | Runtime sensor, ship logs to a SIEM, container-aware playbooks |
| Orchestrator | Harden Kubernetes | The control plane owns the whole cluster | RBAC, network policies, Pod Security Standards, CIS benchmark, patch |
Work top to bottom. Risk enters earliest at build, and the cost of fixing it rises at every stage it survives.
Frequently Asked Questions
What are the most important container security best practices?
The highest-leverage practices are building minimal images from trusted, digest-pinned bases, scanning for vulnerabilities in the pipeline and failing the build on critical findings, signing images and verifying them at deploy, running containers as non-root with least privilege and a read-only filesystem, detecting threats at runtime, and hardening the orchestrator with RBAC and network policies. Build-stage controls come first because risk is cheapest to fix there.
How is container security different from traditional VM security?
A container shares the host kernel instead of running its own, so a breakout lands directly on the node and every other container on it, with no hypervisor boundary in between. Containers are also dense and short-lived: one bad image is deployed many times, and a workload can vanish before its logs are read. That makes minimal images, strict runtime least privilege, and runtime detection more central than they are for long-lived VMs.
Should containers run as root?
No. Running as root means a process that escapes the container is already root when it reaches the host. Run as a specific non-root UID, set runAsNonRoot, use a read-only root filesystem, drop all Linux capabilities and add back only what is required, and set allowPrivilegeEscalation: false. Never run a --privileged container or mount the Docker socket into one.
What is image scanning and where should it happen?
Image scanning inspects an image's layers and packages for known vulnerabilities and embedded secrets. It should happen in the build pipeline, where you fail the build on critical and high findings before the image reaches the registry, and continuously against stored images, because a CVE disclosed after you scanned can turn a clean image vulnerable. Prioritize findings by whether the vulnerable code is actually reachable, not by raw CVSS alone.
Why do I need to sign container images?
Scanning shows that an image carries no known vulnerabilities; signing proves it is the exact artifact you built and that nobody swapped it on the way to the cluster. Without verification, an attacker who can push to your registry or redirect a pull gets your runtime controls protecting a malicious workload. Sign at build with a tool like cosign and configure admission control to admit only signed, verified images.
How do I secure a Kubernetes cluster running containers?
Lock down the API server with least-privilege RBAC and never bind cluster-admin to workloads, default pod-to-pod traffic to deny and open only required flows with network policies, enforce Pod Security Standards at admission to reject privileged and root pods, disable unused service-account token mounts, run the cluster against the CIS Kubernetes Benchmark, patch the control plane and nodes, and encrypt etcd at rest.
The bottom line
Container security best practices are not a product or a single scan. They are a set of controls applied at each stage a container moves through: a minimal, signed, scanned image at build and ship time, a non-root, least-privilege container at runtime, runtime detection for what hardening does not stop, and a locked-down orchestrator over all of it. The order matters because cost does: a flaw caught at build is a red pipeline; the same flaw at runtime is an incident.
None of these are exotic. They are defaults reversed: root turned off, capabilities dropped, images pinned and signed, pods denied each other by default. That is exactly why getting them right separates a defensible cluster from one that is a single breach away from losing the node. The work is continuous because the pipeline is: every new image and every new disclosure is another chance to ship the default back in.