Kubernetes Security Best Practices: A Defender's Guide
Kubernetes security best practices are the controls a team applies across the control plane, workloads, network, secrets, and runtime to protect a cluster at every layer an attacker can move through.
A default Kubernetes cluster is a flat, trusting network with a single God-mode API. Every pod can reach every other pod. Every namespace can run a root container that mounts the host filesystem unless something stops it. A leaked service-account token is a kubectl session against the control plane, and that control plane keeps every secret in plaintext inside etcd until you tell it otherwise. None of that is a bug. It is what you get when you kubeadm init and walk away, and it is the starting position an attacker inherits the moment they land one pod.
Kubernetes security best practices are the controls that close those defaults across the layers an attacker actually moves through: the control plane that runs the cluster, the workloads it schedules, the network between them, the secrets they read, and the runtime where compromise plays out. This is the practical companion to the framework article. If you are choosing a hardening baseline, the Kubernetes frameworks NIST vs CIS comparison covers which standard to measure against; this piece is the field manual for the specific controls a defender applies and why each one cuts risk.
How to think about Kubernetes security best practices
Kubernetes is not one system to secure. It is a control plane plus a fleet of nodes plus everything scheduled on them, and an attacker does not respect those boundaries; they pivot across them. So the practices group by the layer where the risk lives: the control plane (the API server, etcd, the kubelets), the workloads (what a pod is allowed to be and do), the network (who can talk to whom), the supply chain and secrets (what runs and what it reads), and runtime (what you see while it happens). A control at one layer does not cover a gap at another. Locked-down RBAC does nothing for a pod that runs privileged and mounts the host. A hardened pod does nothing if any namespace can reach it over a flat network.
Two properties make this its own discipline. First, the API server is a single point of total control: anything that can talk to it with enough rights owns the cluster, so identity and authorization are the real perimeter, not the network edge. Second, nodes share a kernel with the pods they run, the same property that makes container security its own problem, so a pod that escapes lands on the node and every workload on it. Best practices answer both: shrink what each identity and each pod is allowed to do, then watch what they actually do.
The layers below are ordered by blast radius. The control plane comes first because it owns everything else.
Lock down the control plane
The control plane is the cluster. The API server authenticates and authorizes every action, etcd stores all state including secrets, and the kubelet on each node executes what the API server schedules. Compromise any of the three and the workloads stop mattering.
Why it matters. A single over-permissive role, an exposed API endpoint, or an unauthenticated kubelet does not put one workload at risk; it puts the whole cluster at risk. RBAC that binds cluster-admin to a service account hands that account the keys. An etcd reachable without mutual TLS exposes every secret in cleartext to whoever finds it.
How to do it.
- Apply least-privilege RBAC. Grant the narrowest role that works, scope it to a namespace with a Role rather than a cluster-wide ClusterRole wherever possible, and never bind cluster-admin to workloads or to the default service account. Audit who can create pods, escalate, or bind roles, because those verbs are paths to more privilege.
- Disable default service-account token automounting. Set automountServiceAccountToken: false on pods that never call the API. A mounted token is a credential sitting in the pod waiting to be stolen.
- Encrypt etcd at rest and lock its access. Configure an EncryptionConfiguration so secrets are not stored in plaintext, require mutual TLS for every etcd client, and keep etcd off the public network. etcd is the crown jewels datastore; treat it like one.
- Protect the API server and kubelets. Disable anonymous authentication, require authentication on the kubelet API, never expose the API server or the dashboard to the public internet without authentication, and turn on API server audit logging so every privileged action leaves a record.
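The RBAC audit in the first bullet can be partly automated. The following is a minimal sketch, assuming roles have already been exported as dictionaries (for example from `kubectl get roles,clusterroles -A -o json`). The `RISKY_GRANTS` set, the function name, and the two sample roles are illustrative assumptions, and the check ignores `apiGroups` to stay short.

```python
# Verb/resource pairs that can lead to more privilege. This set is an
# illustrative assumption, not an exhaustive list.
RISKY_GRANTS = {
    ("create", "pods"),
    ("escalate", "roles"),
    ("escalate", "clusterroles"),
    ("bind", "roles"),
    ("bind", "clusterroles"),
}


def risky_grants(role: dict) -> list:
    """Return sorted (verb, resource) pairs in a Role or ClusterRole that match RISKY_GRANTS.

    A "*" in verbs or resources is treated as matching everything.
    apiGroups are ignored to keep the sketch short.
    """
    found = set()
    for rule in role.get("rules", []):
        verbs = rule.get("verbs", [])
        resources = rule.get("resources", [])
        for verb, resource in RISKY_GRANTS:
            verb_hit = "*" in verbs or verb in verbs
            resource_hit = "*" in resources or resource in resources
            if verb_hit and resource_hit:
                found.add((verb, resource))
    return sorted(found)


# Hypothetical roles for illustration.
reader = {
    "kind": "Role",
    "metadata": {"name": "config-reader", "namespace": "shop"},
    "rules": [{"apiGroups": [""], "resources": ["configmaps"], "verbs": ["get", "list"]}],
}
deployer = {
    "kind": "ClusterRole",
    "metadata": {"name": "deployer"},
    "rules": [{"apiGroups": [""], "resources": ["pods"], "verbs": ["*"]}],
}

print(risky_grants(reader))    # []
print(risky_grants(deployer))  # [('create', 'pods')]
```

A wildcard verb on pods is flagged because it includes create, which is the kind of grant that tends to slip through a manual review.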
Harden workloads with Pod Security Admission
A pod is a request to run a process on a node, and by default that request can ask for almost anything: root, host namespaces, the host filesystem, every Linux capability. Workload hardening is where you say no before the pod schedules.
Why it matters. The node kernel is the only boundary between a pod and the host. A pod that runs as root, sets privileged: true, mounts hostPath: /, or shares the host PID namespace is one process bug away from owning the node and every neighbor on it. Enforcing this per-pod by hope does not scale; it has to be enforced by the cluster.
How to do it.
- Enforce Pod Security Standards with Pod Security Admission. PodSecurityPolicy was removed in Kubernetes v1.25; the built-in replacement is Pod Security Admission, which applies one of three Pod Security Standards profiles (privileged, baseline, restricted) per namespace via labels. Default namespaces to restricted and grant exceptions deliberately.
- Set a hard securityContext on every pod. runAsNonRoot: true with a specific UID, readOnlyRootFilesystem: true, allowPrivilegeEscalation: false, and capabilities.drop: ["ALL"] with only the few a workload truly needs added back. These are the same least-privilege directives that contain any container, made non-optional by admission.
- Forbid host access. No privileged: true, no hostNetwork, hostPID, or hostIPC, no hostPath mounts to sensitive paths, and never the container runtime socket. Each is a direct path off the pod and onto the node.
- Use a policy engine for rules the built-in standards miss. Pod Security Admission covers the common cases; an admission controller like Kyverno or OPA Gatekeeper enforces the rest, such as required labels, allowed registries, or signature verification, before a workload is admitted.
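As a rough illustration of the first two bullets, the sketch below shows the namespace labels Pod Security Admission reads, plus a small pre-deployment check for the securityContext settings listed above. The namespace name, container, image, and UID are hypothetical, and the check only looks at the container-level securityContext, even though some of these fields can also be set at the pod level.

```python
# Namespace labels that Pod Security Admission reads. The namespace name
# "shop" is hypothetical.
namespace = {
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {
        "name": "shop",
        "labels": {
            "pod-security.kubernetes.io/enforce": "restricted",
            "pod-security.kubernetes.io/warn": "restricted",
            "pod-security.kubernetes.io/audit": "restricted",
        },
    },
}


def security_context_findings(container: dict) -> list:
    """List the ways a container's securityContext falls short of the settings above."""
    ctx = container.get("securityContext", {})
    findings = []
    if ctx.get("privileged") is True:
        findings.append("privileged is true")
    if ctx.get("runAsNonRoot") is not True:
        findings.append("runAsNonRoot is not true")
    if ctx.get("readOnlyRootFilesystem") is not True:
        findings.append("readOnlyRootFilesystem is not true")
    if ctx.get("allowPrivilegeEscalation") is not False:
        findings.append("allowPrivilegeEscalation is not false")
    if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
        findings.append("capabilities.drop does not include ALL")
    return findings


# Hypothetical container spec with the hardened settings applied.
hardened = {
    "name": "api",
    "image": "registry.example.com/shop/api",
    "securityContext": {
        "runAsNonRoot": True,
        "runAsUser": 10001,
        "readOnlyRootFilesystem": True,
        "allowPrivilegeEscalation": False,
        "capabilities": {"drop": ["ALL"]},
    },
}

print(security_context_findings(hardened))         # []
print(security_context_findings({"name": "bare"}))
```

A check like this is a convenience for CI, not a substitute for admission: the cluster-side enforcement is what makes the settings non-optional.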
Segment the network with policies
Out of the box, Kubernetes networking is flat: every pod can open a connection to every other pod in the cluster, across namespaces. That is the opposite of what you want when one pod is compromised.
Why it matters. A flat network turns a single foothold into cluster-wide reach. An attacker who lands in a front-end pod can scan and pivot to the database, the metadata service, and the API server with nothing in the way. Segmentation is what converts one compromised pod from a cluster problem into a contained one, and it is the single most effective brake on lateral movement inside a cluster.
How to do it.
- Default to deny. Apply a default-deny NetworkPolicy per namespace for both ingress and egress, then allow only the specific flows each service needs. A pod that can only reach what it must reach cannot be used to reach everything.
- Control egress, not just ingress. Restricting outbound traffic blocks command-and-control callbacks and data exfiltration from a compromised pod. Most teams write ingress rules and forget egress; the egress rule is the one that stops the beacon.
- Lock down the cloud metadata endpoint. Block pod access to the node's instance metadata service (such as 169.254.169.254), a common path to steal node cloud credentials and escalate out of the cluster entirely.
- Confirm your CNI enforces policies. A NetworkPolicy is inert unless the CNI plugin (Calico, Cilium, and others) actually enforces it. Verify enforcement; a policy the network ignores is documentation, not a control.
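The default-deny policy from the first bullet is small enough to show in full. This is a sketch rather than a complete network design: the helper names, the policy name, and the namespace are assumptions, and the manifest is built as a Python dictionary and printed as JSON, which `kubectl apply -f -` accepts alongside YAML.

```python
import json


def default_deny(namespace: str) -> dict:
    """Build a NetworkPolicy that selects every pod in the namespace and allows nothing."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector matches all pods in the namespace.
            "podSelector": {},
            # Listing both types with no rules denies ingress and egress.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def has_default_deny(policies: list) -> bool:
    """True if some policy selects every pod, covers both directions, and allows nothing."""
    for policy in policies:
        spec = policy.get("spec", {})
        selects_all = spec.get("podSelector") == {}
        both_directions = {"Ingress", "Egress"} <= set(spec.get("policyTypes", []))
        no_rules = not spec.get("ingress") and not spec.get("egress")
        if selects_all and both_directions and no_rules:
            return True
    return False


# "shop" is a hypothetical namespace.
print(json.dumps(default_deny("shop"), indent=2))
```

Running `has_default_deny` over each namespace's exported policies is one way to spot the namespace that never got a policy. It says nothing about whether the CNI enforces the policy, which still has to be verified on the cluster.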
Secure the supply chain and secrets
Two questions decide whether the cluster runs what you intended: where do images come from, and how do workloads get their credentials. Get either wrong and the hardening above protects a malicious workload or leaks the keys to it.
Why it matters. If an attacker can push to a registry the cluster pulls from, or substitute a look-alike image, your runtime controls guard their code perfectly. And a cluster that stores secrets in plaintext, or hands every pod a static long-lived credential, gives an intruder the next hop for free.
How to do it.
- Admit only trusted, verified images. Pull from private registries you control, pin images by sha256 digest rather than mutable tags, and use admission control to verify image signatures (for example with Sigstore cosign) so an unsigned or unknown image never schedules. This is the cluster-side enforcement point against a poisoned or substituted artifact in the software supply chain.
- Scan images and fail the pipeline. Block critical and high CVEs before the image reaches the registry, and rescan stored images on new disclosures. The cheapest place to stop a vulnerable workload is before it deploys.
- Encrypt secrets and prefer external secret stores. Kubernetes Secret objects are base64-encoded, not encrypted, until you enable etcd encryption at rest. For sensitive credentials, pull from an external manager (such as HashiCorp Vault or a cloud secret manager) and inject at runtime instead of baking secrets into manifests or images.
- Use short-lived, scoped credentials. Bind workloads to cloud IAM through workload identity rather than static keys, scope each service account to one job, and rotate. A credential that expires on its own is one an attacker cannot reuse next week.
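Digest pinning from the first bullet is easy to check before a manifest ever reaches the cluster. The sketch below assumes a pod spec is available as a dictionary. The registry, image names, and the all-`a` digest are placeholders, and the check only confirms that a reference ends in a sha256 digest. It does not verify signatures, which remains the job of admission control.

```python
import re

# An image reference is treated as pinned if it ends in @sha256:<64 hex chars>.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image: str) -> bool:
    """True if the image reference ends in a sha256 digest instead of relying on a tag."""
    return DIGEST_RE.search(image) is not None


def unpinned_images(pod_spec: dict) -> list:
    """Return the image references in a pod spec that are not pinned by digest."""
    containers = pod_spec.get("initContainers", []) + pod_spec.get("containers", [])
    return [c["image"] for c in containers if not is_digest_pinned(c["image"])]


# Hypothetical pod spec; the digest is a placeholder, not a real image digest.
pod_spec = {
    "containers": [
        {"name": "api", "image": "registry.example.com/shop/api@sha256:" + "a" * 64},
        {"name": "sidecar", "image": "registry.example.com/shop/sidecar:latest"},
    ]
}

print(unpinned_images(pod_spec))  # ['registry.example.com/shop/sidecar:latest']
```

The same check could run in CI next to the image scan, so a mutable tag fails the pipeline before it reaches the registry.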
Detect and respond at runtime
Hardening lowers the odds of compromise; it does not make it impossible, and Kubernetes is fast and ephemeral enough that a missed event is gone for good. Runtime detection is how you see the attempt and the breakout that succeeds.
Why it matters. A pod that lives for ninety seconds and then gets rescheduled leaves no host to image after the fact. If nothing is watching process, file, and network behavior while the pod runs, the first sign of compromise is the outcome, not the intrusion.
How to do it.
- Run a runtime sensor. A behavioral runtime tool (open-source Falco, or a commercial CWPP agent) flags the anomalies that signal compromise: a shell spawned in a container that should never exec one, an unexpected outbound connection, a write to a read-only path, a process escaping its profile.
- Centralize the logs that disappear with the node. Ship API server audit logs, kubelet logs, and container stdout off the node to a SIEM, because the node and the pod are both ephemeral. Audit logs are also where you reconstruct who did what to the control plane after an incident.
- Build Kubernetes-aware response playbooks. Cordon the node, isolate the pod with a deny-all NetworkPolicy, and capture the container filesystem and memory before the scheduler kills and replaces it. The on-host runbook does not fit a workload that is already gone.
- Layer detection as defense in depth. Runtime detection is the last layer of defense in depth, the one that assumes every control before it can fail and watches for the case where one did.
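The isolation step in the playbook bullet can be prepared ahead of time. The following is a sketch of one possible approach, assuming responders label the suspect pod and a pre-built policy selects that label. The `quarantine: "true"` label, the policy name, and the helper are hypothetical.

```python
import json


def quarantine_policy(namespace: str, label_key: str = "quarantine",
                      label_value: str = "true") -> dict:
    """Build a NetworkPolicy that denies all traffic for pods carrying the given label."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "quarantine-deny-all", "namespace": namespace},
        "spec": {
            # Only pods with the quarantine label are selected.
            "podSelector": {"matchLabels": {label_key: label_value}},
            # Both types listed with no rules: nothing in, nothing out.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


# "shop" is a hypothetical namespace.
print(json.dumps(quarantine_policy("shop"), indent=2))
```

One caveat: network policies are additive, so if another policy still selects the pod and allows a flow, that flow stays open. A playbook built on this sketch would also need to take the pod out of the selectors of any allow policies, for example by changing the labels those policies match on.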
Govern continuously: patch, benchmark, audit
A cluster is hardened on Tuesday and drifting by Friday. New workloads ship, RBAC accretes, a node falls behind on patches. Governance is the practice that keeps the other five from rotting.
Why it matters. Configuration drift is silent. The over-permissive role someone added for a deadline, the namespace that never got a network policy, the control plane two minor versions behind on security fixes: none of these announce themselves. Without continuous checking, your real posture diverges from your intended one until an attacker finds the gap.
How to do it.
- Benchmark against a recognized standard. Run the cluster against the CIS Kubernetes Benchmark (with kube-bench or equivalent) on a schedule and treat regressions as findings, not noise. The benchmark is the consensus checklist for the control plane, nodes, and policies.
- Patch the control plane and nodes on a cadence. Kubernetes ships frequent releases with security fixes; an unpatched API server or kubelet is a published exploit waiting to be used. Keep within the supported version skew and patch nodes, not just the control plane.
- Continuously scan posture and entitlements. A posture tool (such as a CSPM or CNAPP) catches the misconfiguration and the over-broad permission that drift in between benchmark runs, and maps which identities can actually reach which resources.
- Review RBAC and policy regularly. Prune unused roles and bindings, re-confirm that restricted is still the namespace default, and verify network policies still match the services they protect. Least privilege is a state you maintain, not one you set once.
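Part of the recurring RBAC review can be scripted. As a minimal sketch, assume the input is the `items` list from `kubectl get clusterrolebindings -o json`. The function name and the sample bindings are hypothetical, and the check covers only one finding: service accounts bound to cluster-admin.

```python
def cluster_admin_service_accounts(bindings: list) -> list:
    """Return "namespace/name" for every ServiceAccount bound to cluster-admin."""
    hits = []
    for binding in bindings:
        role_ref = binding.get("roleRef", {})
        if role_ref.get("kind") != "ClusterRole" or role_ref.get("name") != "cluster-admin":
            continue
        # subjects can be absent or null on a binding, so fall back to an empty list.
        for subject in binding.get("subjects") or []:
            if subject.get("kind") == "ServiceAccount":
                hits.append(f'{subject.get("namespace", "")}/{subject.get("name", "")}')
    return sorted(hits)


# Hypothetical bindings for illustration.
bindings = [
    {
        "metadata": {"name": "ci-admin"},
        "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
        "subjects": [{"kind": "ServiceAccount", "namespace": "ci", "name": "runner"}],
    },
    {
        "metadata": {"name": "ops-admin"},
        "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
        "subjects": [{"kind": "Group", "name": "ops"}],
    },
    {
        "metadata": {"name": "viewers"},
        "roleRef": {"kind": "ClusterRole", "name": "view"},
        "subjects": [{"kind": "ServiceAccount", "namespace": "shop", "name": "default"}],
    },
]

print(cluster_admin_service_accounts(bindings))  # ['ci/runner']
```

Run on a schedule, a check like this turns the deadline-driven binding into a finding instead of leaving it to be discovered during an incident.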
Kubernetes security best practices at a glance
The practices above, condensed by layer. Each is a control you apply and keep applying, not a task you finish once.
| Layer | Best practice | Why it matters | How to do it |
|---|---|---|---|
| Control plane | Least-privilege RBAC, lock etcd and the API | The API server and etcd own the whole cluster | Scope roles, no cluster-admin on workloads, encrypt etcd, no anonymous auth, audit logs |
| Workloads | Pod Security Admission and a hard securityContext | The node kernel is all that separates pod and host | restricted profile, runAsNonRoot, drop ALL caps, no privileged or hostPath |
| Network | Default-deny network policies | A flat network turns one foothold into cluster-wide reach | Deny ingress and egress by default, block metadata endpoint, verify the CNI enforces |
| Supply chain and secrets | Trusted images, encrypted scoped secrets | A bad image or a leaked secret bypasses runtime defenses | Signed digest-pinned images, scan and fail the build, external secret store, short-lived creds |
| Runtime | Detect and respond | Ephemeral pods erase evidence | Runtime sensor, ship audit and container logs to a SIEM, Kubernetes-aware playbooks |
| Governance | Patch, benchmark, audit continuously | Configuration drifts silently | CIS benchmark on a cadence, patch control plane and nodes, posture and RBAC review |
Work from the top down. The control plane has the largest blast radius, and a gap there cannot be covered by any control below it.
Frequently Asked Questions
What are the most important Kubernetes security best practices?
The highest-leverage practices are locking down the control plane with least-privilege RBAC and an encrypted etcd, hardening workloads with Pod Security Admission and a strict securityContext (non-root, drop all capabilities, no privileged), segmenting the network with default-deny policies, admitting only signed scanned images, encrypting and scoping secrets, and running runtime detection. Control-plane controls come first because the API server owns the entire cluster.
What replaced PodSecurityPolicy in Kubernetes?
PodSecurityPolicy was deprecated in Kubernetes v1.21 and removed in v1.25. The built-in replacement is Pod Security Admission, which enforces one of three Pod Security Standards profiles (privileged, baseline, or restricted) at the namespace level using labels. For rules the standards do not cover, a policy engine like Kyverno or OPA Gatekeeper enforces custom admission policies.
How do I stop lateral movement inside a Kubernetes cluster?
By default every pod can reach every other pod, so a compromised pod can pivot freely. Apply a default-deny NetworkPolicy per namespace for both ingress and egress and allow only the flows each service needs, block pod access to the cloud metadata endpoint, and confirm your CNI plugin actually enforces network policies. Controlling egress is what stops command-and-control callbacks and data exfiltration.
Should Kubernetes pods run as root?
No. A pod running as root is one process bug away from root on the node, because the node kernel is the only boundary. Set runAsNonRoot: true with a specific UID, readOnlyRootFilesystem: true, allowPrivilegeEscalation: false, and drop all Linux capabilities, then enforce it with Pod Security Admission set to restricted. Never set privileged: true or mount the container runtime socket.
How are Kubernetes secrets secured?
By default a Kubernetes Secret is only base64-encoded, not encrypted, and is stored in etcd. Enable encryption at rest with an EncryptionConfiguration so secrets are not in plaintext, lock etcd behind mutual TLS and keep it off the public network, and disable service-account token automounting for pods that do not call the API. For sensitive credentials, pull from an external manager such as Vault or a cloud secret manager and use short-lived, scoped, rotated credentials.
What is the difference between securing the cluster and securing the workloads?
Securing the cluster is the control plane and platform: RBAC, etcd, the API server and kubelets, patching, and benchmarking. Securing the workloads is what each pod is allowed to be and do: non-root, dropped capabilities, no host access, network policies, and trusted images. Both are required, because locked-down RBAC does nothing for a privileged pod, and a hardened pod does nothing on a wide-open control plane.
The bottom line
Kubernetes security best practices are not a product or a one-time scan. They are controls applied at each layer an attacker moves through: a locked-down control plane with least-privilege RBAC and encrypted etcd, workloads constrained by Pod Security Admission, a default-deny network, signed images and scoped secrets, runtime detection for what hardening misses, and continuous governance over all of it. The order follows blast radius, the control plane first, because a gap there cannot be patched from below.
None of these are exotic. They are defaults reversed: root turned off, capabilities dropped, pods denied each other by default, tokens unmounted, etcd encrypted. That is exactly why getting them right separates a defensible cluster from one that is a single stolen token away from cluster-admin. The work is continuous because the cluster is: every new workload and every new role is another chance to ship a default back in.