Cloud Security Best Practices: A Defender's Guide

Cloud security best practices are the durable, provider-neutral controls a customer configures to protect the workloads, data, and identities they run in the cloud.

Pull the findings from any cloud posture scanner on an environment that has run for a year and you see the same short list every time. A storage bucket readable by anyone who knows the name. An identity with a wildcard policy that can do nearly everything. A security group open to 0.0.0.0/0 on a database port. A log source that was never turned on, so the one intrusion that mattered left nothing to investigate. None of these are exploits. Each is a setting left at its default, and each is how most cloud breaches actually start.

Cloud security best practices are the durable controls that close those gaps, the ones that hold no matter which provider you run on. They are not a product, and not a checklist you finish once. They are a small set of disciplines (identity, encryption, configuration, logging, network, workload, automation, response) applied continuously to an environment that changes by API call every minute. This guide covers each one as a defender sees it: what the practice is, why it reduces risk, and how to do it. It is written for the people who inherit a cloud account and have to make it defensible: SOC analysts, threat hunters, and incident responders. For the AWS-specific version of this list, see AWS cloud security best practices; this article is the provider-neutral parent.

What cloud security best practices actually are

Ten provider-neutral disciplines, all on the customer side of the shared responsibility line and roughly ordered by blast radius:

  1. Shared responsibility. Know which controls are yours: data, identity, config.
  2. Least-privilege IAM + MFA. Identity is the perimeter. Scope every role, require MFA.
  3. Encryption + key management. At rest and in transit, customer-managed keys.
  4. Fix misconfigurations. CIS Benchmarks, CSPM. The top cause of breaches.
  5. Logging + monitoring. Audit logs everywhere, centralized to a SIEM.
  6. Network segmentation. Private by default, no 0.0.0.0/0 on management ports.
  7. Workload protection. Scan images, immutable patching, run a CWPP.
  8. Automate guardrails. Scan IaC in the pipeline, preventive policy-as-code.
  9. Cloud incident response. Immutable logs, isolation playbooks, forensics account.
  10. Continuous posture. Score against the benchmark, manage exposure, close drift.

The pattern: Identity and configuration come first because they cause the most breaches. None of these are exotic. They are defaults left unchanged, applied continuously to an environment that changes by the minute.

Cloud security is the set of controls a customer configures to protect the workloads, data, and identities they run in a provider's environment. Best practices are the subset that the major providers, the CIS Benchmarks, and breach post-mortems agree reduce the most risk for the least effort. They are provider-neutral on purpose: the console and API names differ between AWS, Azure, and Google Cloud, but every provider has an identity service, an encryption service, an audit log, and a posture tool. The practice is what you do with them.

Two properties of the cloud make this its own discipline rather than data-center security with new logos. First, identity is the perimeter: every API call is authenticated by a credential and reachable from anywhere, so a stolen key puts an attacker inside with no network boundary to cross. Second, configuration is the attack surface: resources are created by API call in seconds, and one wrong setting exposes data instantly. The most common cause of a cloud breach is a setting, not a zero-day. The practices below exist to close both gaps.

Start with the shared responsibility model

Before any specific control, know which controls are yours. Every major provider publishes its own shared responsibility model, and the shorthand holds across all of them: the provider is responsible for security *of* the cloud, the customer for security *in* the cloud. The provider owns the physical data center, hardware, hypervisor, and the internals of its managed services. You own your data, your identities and policies, and your configuration.

Where the line falls depends on the service model, which NIST defines in SP 800-145: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). The more managed the service, the more the provider takes on.

| Layer | IaaS (virtual machines) | PaaS (managed runtime) | SaaS (managed app) |
|---|---|---|---|
| Physical, hardware, hypervisor | Provider | Provider | Provider |
| Operating system, patching | Customer | Provider | Provider |
| Runtime, middleware | Customer | Provider | Provider |
| Network configuration | Customer | Customer (limited) | Provider |
| Identity and access | Customer | Customer | Customer |
| Application code | Customer | Customer | Provider |
| Data and classification | Customer | Customer | Customer |
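
The table above can also be kept as data, which is handy when you tag services in an inventory and want to list what you still own for each one. The sketch below is only a transcription of the table; the dictionary name, the lowercase keys, and the `customer_layers` helper are illustrative assumptions, not part of any provider's tooling.

```python
# Responsibility matrix transcribed from the table above.
# "customer (limited)" is kept as its own value rather than rounded either way.
RESPONSIBILITY = {
    "physical, hardware, hypervisor": {"iaas": "provider", "paas": "provider", "saas": "provider"},
    "operating system, patching": {"iaas": "customer", "paas": "provider", "saas": "provider"},
    "runtime, middleware": {"iaas": "customer", "paas": "provider", "saas": "provider"},
    "network configuration": {"iaas": "customer", "paas": "customer (limited)", "saas": "provider"},
    "identity and access": {"iaas": "customer", "paas": "customer", "saas": "customer"},
    "application code": {"iaas": "customer", "paas": "customer", "saas": "provider"},
    "data and classification": {"iaas": "customer", "paas": "customer", "saas": "customer"},
}


def customer_layers(service_model: str) -> list[str]:
    """Return the layers the customer owns, fully or partly, for a service model."""
    model = service_model.lower()
    if model not in ("iaas", "paas", "saas"):
        raise ValueError(f"unknown service model: {service_model!r}")
    return [
        layer
        for layer, owners in RESPONSIBILITY.items()
        if owners[model].startswith("customer")
    ]


if __name__ == "__main__":
    for model in ("IaaS", "PaaS", "SaaS"):
        print(model, "->", customer_layers(model))
```

Even for SaaS the result still contains identity and data, which is the point of the paragraph that follows.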

Three things never leave the customer regardless of service model: your data, your identities and policies, and your configuration. Every practice below sits on the customer side of that line. "We run in the cloud, so we are secure" is a category error: the provider's infrastructure can be hardened while your account sits wide open.

Enforce least-privilege IAM and multi-factor authentication

Identity is the perimeter, so most of your security effort belongs here. The recurring failure modes are over-privileged identities and weak or unprotected credentials: a role that can do far more than its workload needs, a login with no second factor, an access key that never expires and leaks into code.

Why it matters. Once an attacker holds a valid credential, the control plane sees normal activity, because to the provider it is normal activity. An over-broad policy turns one stolen credential into lateral movement and privilege escalation: an identity that can pass another role, or edit its own policy, is an administrative path waiting to be walked.

How to do it.

  • Grant least privilege. Scope every identity to the specific actions and resources it needs, default to deny, and avoid wildcards. Use the provider's access analyzer to generate fine-grained policies from real activity.
  • Prefer temporary credentials over long-lived keys. Humans federate and assume short-lived roles; workloads use attached service identities, never embedded keys. Reserve static keys for the cases that cannot use a role, and rotate the ones you keep.
  • Require MFA everywhere, preferring phishing-resistant factors (hardware keys, passkeys) for privileged access. A second factor is the single control that most often stops a stolen password from becoming an intrusion.
  • Protect the highest-privilege accounts most. The root or global-administrator account governs everything and often cannot be constrained by ordinary policy. Lock its credentials away, put MFA on it, and do not use it for daily work.
  • Remove what is unused. Permissions, identities, and keys that have gone cold are pure risk. Strip them on a schedule using last-accessed data.
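
The "avoid wildcards" advice can be checked mechanically. The sketch below assumes an AWS-style JSON policy document (`Statement`, `Effect`, `Action`, `Resource`); other providers use different schemas, and the function name and finding strings are illustrative. It deliberately ignores `NotAction` and condition keys, so treat it as a first-pass filter rather than a full policy analyzer.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """Policy fields may hold a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_wildcard_statements(policy: dict[str, Any]) -> list[str]:
    """Flag Allow statements that grant every action, or act on every resource."""
    findings = []
    for index, stmt in enumerate(_as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue  # Deny statements narrow access; they are not the risk here
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        # "*" grants everything; "service:*" grants everything in one service.
        if any(isinstance(a, str) and (a == "*" or a.endswith(":*")) for a in actions):
            findings.append(f"statement {index}: wildcard action")
        if "*" in resources:
            findings.append(f"statement {index}: wildcard resource")
    return findings


# Hypothetical policy used only to exercise the check.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
    ],
}

if __name__ == "__main__":
    for finding in find_wildcard_statements(EXAMPLE_POLICY):
        print(finding)
```

The second statement passes because its action is specific and its resource is scoped to one bucket's objects; only a bare `*` resource is flagged.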

Encrypt data at rest and in transit, and manage the keys

Encryption is the control that limits blast radius. When a misconfiguration or a leaked credential exposes a data store anyway, encryption decides whether the attacker gets data or ciphertext.

Why it matters. Most managed data services can encrypt at rest with almost no effort, and several default to it. Leaving it off, or managing keys carelessly, throws away the one control that contains a breach after other controls have failed. The deeper topic is covered in the cloud encryption breakdown; the practice itself is short.

How to do it.

  • At rest: enable encryption on every store (object storage, block volumes, databases, snapshots, backups). Use the provider's key management service and prefer customer-managed keys for sensitive data, so you control the policy, rotation, and who can decrypt.
  • In transit: require TLS everywhere, enforced with resource policies that reject unencrypted connections rather than merely discouraging them.
  • Manage keys deliberately. A key policy is an access-control decision. Scope who can use versus administer each key, turn on automatic rotation, and log every key use. A key an attacker can use is no protection.

Encryption does not stop the misconfiguration. It makes the misconfiguration survivable.
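
As a rough illustration of the at-rest and key-management bullets, the sketch below audits a list of data stores. The `DataStore` record and its fields are a hypothetical inventory format invented for this example; in practice you would populate it from your provider's APIs or a posture tool's export.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStore:
    """Hypothetical inventory record; field names are assumptions for this sketch."""
    name: str
    encrypted_at_rest: bool
    customer_managed_key: bool
    key_rotation_enabled: bool
    sensitive: bool


def encryption_findings(stores: list[DataStore]) -> list[tuple[str, str]]:
    """Return (store name, problem) pairs for the encryption practices above."""
    findings = []
    for store in stores:
        if not store.encrypted_at_rest:
            findings.append((store.name, "not encrypted at rest"))
            continue  # key checks are meaningless without encryption
        if store.sensitive and not store.customer_managed_key:
            findings.append((store.name, "sensitive data without a customer-managed key"))
        if store.customer_managed_key and not store.key_rotation_enabled:
            findings.append((store.name, "customer-managed key without automatic rotation"))
    return findings


if __name__ == "__main__":
    inventory = [
        DataStore("logs-bucket", False, False, False, False),
        DataStore("customer-db", True, False, False, True),
        DataStore("billing-db", True, True, False, True),
    ]
    for name, problem in encryption_findings(inventory):
        print(f"{name}: {problem}")
```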

Harden configuration and fix misconfigurations

Misconfiguration is the dominant cause of cloud incidents, so finding and fixing config drift is its own practice rather than a side effect of the others. A public storage bucket, an open management port, an unencrypted database, a disabled log: each is a setting, and attackers' automated scanners often find it before your own team does.

Why it matters. A misconfiguration needs no exploit. The attacker only has to find it, and the internet is scanned continuously. Many of the largest cloud breaches were exactly this: storage left readable, no malware involved.

How to do it.

  • Measure against a consensus baseline. The CIS Benchmarks are consensus-developed, provider-specific secure-configuration baselines, organized into Level 1 (a sensible baseline) and Level 2 (defense in depth). They turn "are we secure?" into pass/fail checks across identity, logging, networking, and storage, and are updated as providers ship new services, so map against the current version.
  • Run a CSPM continuously. Cloud Security Posture Management tools scan your environment against those baselines and your own policy, and surface drift the moment it appears. The point is continuous assessment, not a quarterly audit.
  • Treat a failed control as an exposure with a clock on it, not a line item to file later. A public bucket or a disabled audit log is an open door until it is closed.
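
"An exposure with a clock on it" can be made literal by attaching a deadline to every failed control. The remediation windows below are placeholder values chosen for illustration, not a recommendation from any benchmark; set them to match your own policy.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical remediation windows per severity; adjust to your own policy.
REMEDIATION_WINDOW = {
    "critical": timedelta(hours=24),
    "high": timedelta(days=7),
    "medium": timedelta(days=30),
}


def remediation_deadline(severity: str, detected_at: datetime) -> datetime:
    """The moment by which a failed control of this severity must be closed."""
    return detected_at + REMEDIATION_WINDOW[severity]


def is_overdue(severity: str, detected_at: datetime, now: datetime) -> bool:
    """True once the finding has been open longer than its window allows."""
    return now > remediation_deadline(severity, detected_at)


if __name__ == "__main__":
    detected = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print("critical finding due by", remediation_deadline("critical", detected))
```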

CSPM finds the misconfiguration; workload protection watches what runs. The two together are often packaged as a CNAPP, and the distinction is worth understanding from the CWPP vs CSPM comparison.

Enable comprehensive logging and monitoring

A cloud account is close to blind by default. Providers typically record only a limited slice of control-plane activity, with short retention, until you configure a durable audit trail, and an intrusion in an account without one leaves little to investigate.

Why it matters. Detection and forensics in the cloud run entirely on logs. You cannot detect, alert on, or reconstruct what was never recorded, and "we did not notice for months" is what an unlogged account guarantees.

How to do it.

  • Turn on the control-plane audit log in every account and region. It records management API calls: role assumptions, policy changes, console logins. On AWS it is CloudTrail; Azure has the Activity Log and Microsoft Entra logs; Google Cloud has Cloud Audit Logs. The practice is the same: capture it everywhere, with tamper-evidence where the provider offers it.
  • Add data-plane and network logs. Flow logs, storage access logs, and DNS logs catch activity the control-plane log does not.
  • Centralize to a SIEM. Forward every account's logs to one place a SOC can query and alert on. Centralized logging is the precondition for every detection you will ever write; scattered per-account logs are not monitoring.
  • Alert on the high-signal events: root or global-admin use, policy changes that widen access, public-exposure changes, logging being disabled, and access from unexpected geographies.
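
The high-signal events listed above translate into a small triage function once logs are centralized. The sketch assumes events have already been normalized into a dictionary with the fields `actor_type`, `action`, `widens_access`, and `source_country`; those field names, the action strings, and the expected-country set are all hypothetical, since every provider and SIEM names these differently.

```python
# Hypothetical allow-list of countries the organization normally operates from.
EXPECTED_COUNTRIES = {"US", "DE"}


def high_signal_reasons(event: dict) -> list[str]:
    """Return the reasons, if any, that a normalized audit event deserves an alert."""
    reasons = []
    if event.get("actor_type") in {"root", "global_admin"}:
        reasons.append("root or global-admin use")

    action = event.get("action", "")
    if action == "policy.update" and event.get("widens_access"):
        reasons.append("policy change widens access")
    if action == "resource.make_public":
        reasons.append("public-exposure change")
    if action == "logging.disable":
        reasons.append("audit logging disabled")

    country = event.get("source_country")
    if country and country not in EXPECTED_COUNTRIES:
        reasons.append("access from unexpected geography")
    return reasons


if __name__ == "__main__":
    sample = {"actor_type": "root", "action": "logging.disable", "source_country": "US"}
    print(high_signal_reasons(sample))
```

An event that returns an empty list is not necessarily benign; the function only encodes the short list of events worth paging on.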

Secure the network

Identity is the primary perimeter, but the network still matters. A workload reachable from the whole internet is exposed before any credential is tried, and a flat network lets one compromised host reach everything.

Why it matters. A security group open to 0.0.0.0/0 on a database or administrative port is one of the most common and most exploited cloud misconfigurations. Segmentation contains a compromise so one foothold does not become the whole environment.

How to do it.

  • Private by default. Place workloads in private subnets with no direct route to the internet. Expose only what must be public (a load balancer) and reach private instances through a managed session service or a bastion, not open SSH or RDP.
  • No 0.0.0.0/0 on management ports. Allow only the ports and source ranges each workload needs. An open port 22, 3389, or database port to the world is an invitation.
  • Use private endpoints so traffic to managed databases and storage never crosses the public internet.
  • Restrict egress, not just ingress. Default-deny outbound limits how a compromised host beacons to command and control or exfiltrates data.
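
The "no 0.0.0.0/0 on management ports" rule is easy to test against exported firewall or security-group rules. The sketch below assumes each ingress rule is a dictionary with `cidr`, `from_port`, and `to_port`; that shape and the port list are illustrative assumptions, and the list of sensitive ports is a sample you would extend for your own stack.

```python
import ipaddress

# Sample set of management and database ports; extend for your environment.
SENSITIVE_PORTS = {
    22: "SSH",
    3389: "RDP",
    3306: "MySQL",
    5432: "PostgreSQL",
    1433: "SQL Server",
}


def world_open_findings(rules: list[dict]) -> list[str]:
    """Flag ingress rules that expose a sensitive port to the whole internet."""
    findings = []
    for rule in rules:
        network = ipaddress.ip_network(rule["cidr"], strict=False)
        # A prefix length of 0 means "every address": 0.0.0.0/0 or ::/0.
        if network.prefixlen != 0:
            continue
        for port, name in SENSITIVE_PORTS.items():
            if rule["from_port"] <= port <= rule["to_port"]:
                findings.append(f"{name} ({port}) open to {rule['cidr']}")
    return findings


if __name__ == "__main__":
    example_rules = [
        {"cidr": "0.0.0.0/0", "from_port": 443, "to_port": 443},    # public HTTPS: fine
        {"cidr": "0.0.0.0/0", "from_port": 22, "to_port": 22},      # SSH to the world
        {"cidr": "10.0.0.0/8", "from_port": 5432, "to_port": 5432}, # internal only
    ]
    for finding in world_open_findings(example_rules):
        print(finding)
```

Checking port ranges rather than exact ports matters: a rule that opens 0 to 65535 exposes every sensitive port at once.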

Protect the workloads and automate guardrails

The hosts, containers, and functions that run your code are still attack surface, and manual review does not scale to an environment that changes every minute. These two practices keep the others true at deploy time.

Why it matters. A vulnerable image deployed a hundred times is a hundred vulnerable workloads; autoscaling multiplies exposure rather than removing it. And a human reviewing changes by hand will miss the open port in the hundredth pull request, where code enforcing policy will not.

How to do it.

  • Scan images before they run. Block known-vulnerable images and embedded secrets at the registry, so a bad image never reaches production. Shifting the check left is cheaper than finding the flaw in a running fleet.
  • Patch via immutable infrastructure. Replace instances from a patched image rather than patching in place, so configuration does not drift between hosts. Minimize the base image: fewer packages, fewer vulnerabilities.
  • Run a CWPP for runtime visibility into processes, file integrity, and behavior on instances and containers: the same kind of telemetry an EDR agent provides on an endpoint.
  • Scan infrastructure-as-code in the pipeline, failing the build on a public bucket, an open security group, or an unencrypted volume before it deploys.
  • Use preventive guardrails, not just detective ones. A service-control or organization policy that refuses to create a public resource stops the misconfiguration rather than reporting it after the fact.
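
A pipeline gate for infrastructure-as-code can be as simple as a script whose exit code fails the build. The sketch below assumes the planned resources have already been parsed into dictionaries with `type`, `name`, and a few boolean or list fields; that format is hypothetical and does not correspond to any particular IaC tool's plan output.

```python
def plan_violations(resources: list[dict]) -> list[str]:
    """Return one message per planned resource that breaks a guardrail."""
    violations = []
    for resource in resources:
        kind, name = resource["type"], resource["name"]
        if kind == "bucket" and resource.get("public", False):
            violations.append(f"bucket {name!r} is public")
        elif kind == "volume" and not resource.get("encrypted", False):
            violations.append(f"volume {name!r} is unencrypted")
        elif kind == "security_group" and "0.0.0.0/0" in resource.get("ingress_cidrs", []):
            violations.append(f"security group {name!r} allows ingress from 0.0.0.0/0")
    return violations


def gate(resources: list[dict]) -> int:
    """Print violations and return a process exit code: 0 passes, 1 fails the build."""
    violations = plan_violations(resources)
    for violation in violations:
        print(f"BLOCKED: {violation}")
    return 1 if violations else 0


if __name__ == "__main__":
    planned = [
        {"type": "bucket", "name": "exports", "public": True},
        {"type": "volume", "name": "data", "encrypted": True},
    ]
    raise SystemExit(gate(planned))
```

Note that a volume with no `encrypted` field is treated as unencrypted. Failing closed on missing data is a design choice here: a gate that passes what it cannot evaluate is only a detective control.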

Plan incident response and manage posture continuously

Cloud incident response is not the on-premises playbook with new tool names, and cloud security is not a project with an end date. Evidence is API-driven and ephemeral, and the environment drifts the moment you stop watching.

Why it matters. A compromised instance can be terminated by autoscaling before you image it, and a stolen credential moves at API speed, so the unprepared team loses the evidence while deciding what to do. Meanwhile every control above degrades without enforcement: an account passes the benchmark, then someone opens a port for a demo and forgets it.

How to do it.

  • Know where the evidence lives and keep it immutable. Audit logs, snapshots, and flow logs are your forensic record; retain them in a separate, locked-down forensics account so an attacker who owns production cannot destroy them.
  • Build isolation playbooks. Quarantine a compromised instance with a deny-all security group and snapshot it before termination; disable a suspected credential and revoke its sessions immediately. Practice the cloud-specific scenario before it happens for real.
  • Score continuously against the benchmark, so drift shows up as a failed control rather than a surprise during an incident.
  • Manage exposure from the attacker's view. Track what is actually reachable from the internet and what an attacker could chain from one foothold, and close the highest-blast-radius paths first, with an owner and a deadline on every finding.
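
The isolation playbook above is mostly about ordering: contain first, preserve evidence second, and only then let the instance go. The sketch below captures that order against a hypothetical `CloudActions` interface. None of these method names belong to a real SDK; in practice each would wrap the corresponding provider API call, and `RecordingCloud` exists only so the playbook can be rehearsed without touching an account.

```python
from typing import Protocol


class CloudActions(Protocol):
    """Hypothetical interface; each method would wrap a real provider API call."""

    def attach_deny_all_group(self, instance_id: str) -> None: ...
    def snapshot_volumes(self, instance_id: str) -> None: ...
    def disable_credential(self, credential_id: str) -> None: ...
    def revoke_sessions(self, credential_id: str) -> None: ...


def isolate_instance(cloud: CloudActions, instance_id: str) -> None:
    """Quarantine first, then snapshot, so evidence exists before any termination."""
    cloud.attach_deny_all_group(instance_id)
    cloud.snapshot_volumes(instance_id)


def contain_credential(cloud: CloudActions, credential_id: str) -> None:
    """Disable the credential, then revoke sessions it has already issued."""
    cloud.disable_credential(credential_id)
    cloud.revoke_sessions(credential_id)


class RecordingCloud:
    """Test double that records calls instead of performing them."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, str]] = []

    def attach_deny_all_group(self, instance_id: str) -> None:
        self.calls.append(("attach_deny_all_group", instance_id))

    def snapshot_volumes(self, instance_id: str) -> None:
        self.calls.append(("snapshot_volumes", instance_id))

    def disable_credential(self, credential_id: str) -> None:
        self.calls.append(("disable_credential", credential_id))

    def revoke_sessions(self, credential_id: str) -> None:
        self.calls.append(("revoke_sessions", credential_id))


if __name__ == "__main__":
    cloud = RecordingCloud()
    isolate_instance(cloud, "i-123")
    contain_credential(cloud, "key-9")
    for step in cloud.calls:
        print(step)
```

Revoking sessions is a separate step because disabling a credential does not necessarily invalidate temporary sessions that were issued before it was disabled.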

Cloud security best practices at a glance

The practices above, condensed. Each line is a discipline you apply continuously, not a task you finish once.

| Best practice | Why it matters | How to do it |
|---|---|---|
| Shared responsibility | Tells you which controls are yours | Map services to the IaaS/PaaS/SaaS split; own data, identity, config |
| Least-privilege IAM + MFA | Identity is the perimeter | Scope every identity, require MFA, prefer temporary credentials, protect root |
| Encryption + key management | Limits blast radius when a store is exposed | Encrypt at rest and in transit, customer-managed keys, rotate and log key use |
| Fix misconfigurations | Top cause of cloud breaches | Measure against CIS Benchmarks, run CSPM continuously, close drift fast |
| Logging and monitoring | The account is blind until logging is on | Audit logs everywhere, add flow/DNS logs, centralize to a SIEM |
| Network segmentation | A flat or internet-facing network is exposed first | Private by default, no 0.0.0.0/0 on management ports, private endpoints, egress filtering |
| Workload protection + guardrails | Workloads are attack surface; review does not scale | Scan images, immutable patching, CWPP, scan IaC, preventive guardrails |
| IR + posture management | Evidence is ephemeral; controls drift | Immutable logs, isolation playbooks, a forensics account, score continuously, manage exposure |

Work roughly top to bottom. The order tracks blast radius: a compromised identity or a misconfiguration loses the environment, while a single unpatched workload loses one host.

The bottom line

Cloud security best practices are the durable controls that sit on the customer side of the shared responsibility model, applied continuously to an environment that changes by the minute. The provider secures the infrastructure; you secure the identities, configuration, and data on top of it. The practices that move the most risk are the unglamorous ones: a scoped identity with MFA instead of a wildcard policy, encryption with managed keys, a configuration measured against the CIS Benchmarks, audit logs centralized to a SIEM, a private network, and a guardrail that refuses to create a public resource in the first place.

None of these are exotic. The gaps they close are defaults left unchanged on most accounts, which is exactly why getting them right separates a defensible cloud environment from a breached one. The work is continuous because the environment is: a benchmark you pass today drifts tomorrow, and the practice is to catch the regression before an attacker does.

Frequently asked questions

What are the most important cloud security best practices?

The highest-leverage practices are understanding the shared responsibility model, enforcing least-privilege IAM with MFA, encrypting data with managed keys, fixing misconfigurations against the CIS Benchmarks with a CSPM, centralizing logging to a SIEM, segmenting the network, protecting workloads, automating guardrails, and planning cloud incident response. Identity and configuration come first because they cause the most breaches.

What is the shared responsibility model in cloud security?

It is the split of security duties between provider and customer. The provider is responsible for security of the cloud (the physical data center, hardware, hypervisor, and managed-service internals); the customer is responsible for security in the cloud (their data, identities, and configuration). Where the line falls depends on the service model: IaaS leaves more to the customer, SaaS shifts more to the provider, but data, identity, and configuration always stay with the customer.

What is the most common cause of cloud security breaches?

Misconfiguration. A public storage bucket, an over-privileged identity, an open management port, or a disabled log is a setting left at an insecure value, and it needs no exploit to abuse. Attackers scan for these continuously, which is why measuring against a baseline like the CIS Benchmarks and running continuous posture management matter more than chasing exotic threats.

What is the difference between CSPM and CWPP?

Cloud Security Posture Management (CSPM) scans your cloud configuration against secure-configuration baselines and flags misconfigurations like public buckets and open ports. A Cloud Workload Protection Platform (CWPP) protects the workloads themselves (the hosts, containers, and functions) with runtime visibility into processes, file integrity, and behavior. CSPM secures the configuration; CWPP secures what runs. Many tools combine both as a CNAPP.

How do I secure identity and access in the cloud?

Grant least privilege so every identity reaches only the actions and resources it needs, and default to deny. Require MFA everywhere, preferring phishing-resistant factors for privileged access. Prefer short-lived federated credentials over long-lived static keys, rotate any keys you must keep, protect the root or global-administrator account most, and remove permissions and keys that have gone unused.

Do cloud providers encrypt my data automatically?

Some managed services encrypt at rest by default, but do not assume it. Explicitly enable encryption on every store (object storage, block volumes, databases, snapshots) and use the provider's key management service, preferring customer-managed keys so you control the policy and rotation. Enforce TLS in transit with resource policies that reject unencrypted connections. Encryption limits the blast radius when a misconfiguration exposes a store anyway.
