What Is Cloud Detection? Signals, Sources, Methods
Cloud detection is the practice of finding malicious or anomalous activity in cloud environments by collecting and analyzing control-plane, identity, network, and runtime telemetry.
The first sign of most cloud intrusions is not malware on a disk. It is an API call. A set of long-lived access keys that show up calling GetCallerIdentity from an IP in a country the account has never operated in, then enumerating S3 buckets, then creating a new IAM user with administrator policy attached. No exploit, no dropped binary, no host to image. Just a sequence of authenticated, authorized API calls that, read together, are an attacker working through a stolen credential. Catching that sequence before it finishes is cloud detection.
Cloud detection is the practice of identifying malicious or anomalous activity inside cloud environments by collecting and analyzing the telemetry those environments emit: control-plane audit logs, identity activity, network flow, and runtime signals from workloads. It is the detection half of the broader cloud detection and response discipline. This guide covers what cloud detection is, why it does not work the way on-prem detection does, the data sources you actually pull from, the techniques that turn those sources into alerts, the threats they catch, and where the whole thing gets hard. It is written for the analysts and detection engineers who have to answer "is this API call an attacker?" with the only evidence the cloud gives them: logs.
What is cloud detection?
Cloud detection is the process of finding threats in a cloud environment by analyzing the signals it produces. A signal is any record of activity: an API call written to an audit log, an authentication event, a network connection, a process spawned inside a container. Detection consumes those signals, evaluates them against known-bad patterns and learned baselines, and raises an alert when activity looks like an attacker rather than normal operations.
The output of cloud detection is a finding: a structured statement that something specific happened, on a specific resource, by a specific identity, at a specific time, with enough context to triage it. A finding is not a response. Stopping the attacker, whether by revoking the key or quarantining the workload, is the job of cloud response, the other half of the discipline. Detection ends at "here is what is happening and why we think it is malicious." Everything after that is remediation.
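To make "structured statement" concrete, here is a minimal sketch of a finding as a Python dataclass. The field names, the account ID, and the ARNs are illustrative assumptions, not a schema from any provider or product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Finding:
    """One detection result: what happened, where, by whom, and when."""
    title: str              # what happened
    resource: str           # the specific resource affected
    principal: str          # the identity that acted
    observed_at: datetime   # when the activity occurred
    reason: str             # why detection considers it malicious
    context: dict = field(default_factory=dict)  # extra detail for triage


# Hypothetical example values; the account ID and names are placeholders.
finding = Finding(
    title="Admin policy attached to newly created user",
    resource="arn:aws:iam::123456789012:user/example-user",
    principal="arn:aws:iam::123456789012:user/alice",
    observed_at=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    reason="CreateUser followed by AttachUserPolicy with an administrator policy",
    context={"source_ip": "203.0.113.10"},
)
```

Note that nothing in the record describes an action taken. It carries evidence and context only, which is the boundary between detection and response.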
What makes cloud detection its own topic, rather than just network detection with the servers somewhere else, is where the signals come from and what they look like. On-prem, you watch hosts and the wire. In the cloud, the most important activity never touches a host you control. It happens in the provider's control plane, the management layer where resources are created, permissions are changed, and data is moved, all through APIs. Detection that only watches workloads misses the half of the attack that runs through the console and the SDK.
Why cloud detection differs from on-prem
Five properties of cloud environments break assumptions that on-prem detection is built on. Understanding them is the difference between detection that works and detection that produces confident, useless alerts.
There is no perimeter to watch. On-prem, traffic crosses a firewall, a SPAN port, an IDS sensor, and you inspect it there. In the cloud, there is no single choke point. Resources are reachable over the internet by design, services talk to each other through provider APIs that never traverse your network, and an attacker with a stolen key operates entirely through the provider's front door. The network is no longer where you see the most.
The control plane is the new attack surface. Every meaningful action (create a VM, attach a policy, snapshot a database, make a bucket public) is an API call against the provider's management layer. That control plane is where attackers do most of their damage, and the audit log of those API calls is the single richest detection source the cloud has. On-prem has no real equivalent; the closest thing is the directory, and even that does not log at API granularity.
Detection is identity-centric. With no perimeter, identity becomes the control plane's access boundary, and so it becomes the thing detection watches. The questions are about who: which principal made this call, does this role normally do this, is this access key being used from a new location, did a user just grant themselves a permission they never had. Credential and identity abuse is the dominant cloud attack pattern, so identity telemetry is the dominant detection signal.
Infrastructure is ephemeral. A container lives for ninety seconds. A serverless function exists only for the duration of one invocation. An autoscaling group launches and terminates instances by the hour. There is often no long-lived host to install an agent on, no disk to image after the fact, and no second chance to collect a signal that was not captured while the resource existed. Detection has to be real-time and log-based, because the thing it is watching may be gone before an analyst looks.
Responsibility is shared. Under the cloud shared responsibility model, the provider secures the infrastructure and gives you logs about your use of it; you are responsible for detecting threats in your accounts, identities, and workloads. The provider will not tell you that your access key was stolen. It will faithfully log every call that key makes. Turning those logs into detection is your half of the deal, and it is the half this article is about.
Cloud detection data sources
Detection is only as good as the telemetry feeding it. In the cloud, the sources fall into a handful of categories, each catching a different slice of the attack. The table maps the primary sources to what each one is good for.
| Data source | What it is | What it catches |
|---|---|---|
| Control-plane audit log (AWS CloudTrail, Azure Activity Log, GCP Cloud Audit Logs) | A record of every management API call: who, what, when, from where, allowed or denied | Credential abuse, privilege escalation, resource tampering, public-exposure changes, recon enumeration |
| Identity logs (CloudTrail STS events, Entra ID sign-in logs, IAM activity) | Authentication and authorization events for users, roles, and keys | Impossible-travel logins, MFA bypass, key use from new locations, role assumption chains |
| Cloud-native findings (Amazon GuardDuty, Microsoft Defender for Cloud, Google SCC) | Provider-run detections that analyze the logs above and emit scored findings | Known-bad IPs and domains, crypto mining, anomalous API behavior, recon, exfiltration patterns |
| Network flow logs (VPC Flow Logs, NSG flow logs) | Connection-level metadata: source, destination, ports, bytes, accept or reject | Lateral movement, command-and-control beacons, data egress volume, port scanning |
| Runtime / workload telemetry (eBPF or agent-based sensors) | Process, file, and syscall activity inside running VMs, containers, and pods | In-workload execution, container escape, reverse shells, fileless activity an API log cannot see |
Two of these deserve more than a table cell. The control-plane audit log is the spine. AWS CloudTrail, Azure Activity Log, and GCP Cloud Audit Logs each record management API calls as structured events, and almost every cloud detection rule worth writing starts there. The difference between CloudTrail management events and CloudWatch metrics trips up newcomers constantly; the CloudTrail vs CloudWatch distinction is worth getting straight before you build detections, because one is your audit record and the other is your performance telemetry, and confusing them costs you the source you actually need.
Cloud-native findings services do a chunk of the work for you. Amazon GuardDuty is the clearest example. By AWS's own documentation, GuardDuty's foundational data sources are CloudTrail management events, VPC Flow Logs, and Route 53 Resolver DNS query logs, which it consumes automatically through independent, duplicated streams that do not depend on or alter your own logging configuration. Additional coverage (S3 data-event analysis, EKS audit log monitoring, RDS login activity, Lambda network activity, and runtime monitoring via an agent) comes from optional protection plans you turn on separately. The practical takeaway: GuardDuty out of the box watches your control plane, your network flow, and your DNS, and you opt in to the rest. Knowing exactly which sources a managed detector reads is what tells you where your blind spots are.
Cloud detection techniques
Having the logs is half of it. The other half is the logic that decides which records are worth an analyst's time. Four techniques carry most of the load, and real detection programs run all four at once.
Signature and rule-based detection matches activity against known-bad patterns: a call from a flagged IP, a specific risky API like PutBucketPolicy opening a bucket to the public, a sequence such as CreateUser followed immediately by AttachUserPolicy with an admin policy. Rules are precise and explainable, which makes them the backbone of detection engineering, but they only catch what you already thought to write a rule for.
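The CreateUser-then-AttachUserPolicy sequence can be sketched as a small rule over CloudTrail-style records. This is an illustration under assumptions: the events are reduced to the few fields the rule reads, the ten-minute window is an arbitrary threshold, and only the AWS managed AdministratorAccess policy is treated as "admin".

```python
from datetime import datetime, timedelta

ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"
WINDOW = timedelta(minutes=10)  # assumed threshold; tune per environment


def parse_time(value: str) -> datetime:
    # CloudTrail eventTime is ISO 8601 in UTC, e.g. 2024-01-01T12:00:00Z
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def detect_create_then_admin(events: list[dict]) -> list[dict]:
    """Return AttachUserPolicy events that grant admin to a just-created user."""
    created: dict[str, datetime] = {}  # userName -> creation time
    hits = []
    for event in sorted(events, key=lambda e: parse_time(e["eventTime"])):
        params = event.get("requestParameters") or {}
        user = params.get("userName")
        when = parse_time(event["eventTime"])
        if event["eventName"] == "CreateUser" and user:
            created[user] = when
        elif (
            event["eventName"] == "AttachUserPolicy"
            and params.get("policyArn") == ADMIN_POLICY_ARN
            and user in created
            and when - created[user] <= WINDOW
        ):
            hits.append(event)
    return hits


# Hypothetical, trimmed events for illustration.
events = [
    {"eventName": "CreateUser", "eventTime": "2024-01-01T12:00:00Z",
     "requestParameters": {"userName": "backup-svc"}},
    {"eventName": "AttachUserPolicy", "eventTime": "2024-01-01T12:01:30Z",
     "requestParameters": {"userName": "backup-svc",
                           "policyArn": ADMIN_POLICY_ARN}},
    {"eventName": "AttachUserPolicy", "eventTime": "2024-01-01T12:02:00Z",
     "requestParameters": {"userName": "alice",
                           "policyArn": "arn:aws:iam::aws:policy/ReadOnlyAccess"}},
]
hits = detect_create_then_admin(events)
```

The rule is explainable in one sentence, which is its strength, and it is blind to an attacker who attaches a customer-managed policy with equivalent permissions, which is its limit.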
Anomaly and behavioral baselining learns what normal looks like for each identity and resource, then flags deviation. This is where behavioral analytics earns its place in the cloud: a role that has called three APIs every day for a year suddenly enumerating every bucket in the account is anomalous even though every individual call is authorized. Baselining catches the novel attack a signature would miss, at the cost of tuning, because "different" and "malicious" are not the same and the gap between them is alert noise.
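A very small version of that idea: record which API calls each principal has made historically, then flag a principal that suddenly makes several calls it has never made before. The flattened record shape (`principal`, `eventName`) and the `min_new` threshold are assumptions for the sketch; a production baseline would also weigh frequency, time of day, and peer groups.

```python
from collections import defaultdict


def build_baseline(history):
    """Map each principal to the set of API calls it has made before."""
    baseline = defaultdict(set)
    for event in history:
        baseline[event["principal"]].add(event["eventName"])
    return baseline


def novel_calls(baseline, recent, min_new=3):
    """Return principals making at least min_new never-before-seen API calls."""
    new_by_principal = defaultdict(set)
    for event in recent:
        known = baseline.get(event["principal"], set())
        if event["eventName"] not in known:
            new_by_principal[event["principal"]].add(event["eventName"])
    return {p: names for p, names in new_by_principal.items()
            if len(names) >= min_new}


# Hypothetical data: a role with a narrow, stable history.
history = [
    {"principal": "role/report-job", "eventName": name}
    for name in ("GetObject", "PutObject", "DescribeInstances")
]
recent = [
    {"principal": "role/report-job", "eventName": "GetObject"},
    {"principal": "role/report-job", "eventName": "ListBuckets"},
    {"principal": "role/report-job", "eventName": "GetBucketAcl"},
    {"principal": "role/report-job", "eventName": "GetBucketPolicy"},
    {"principal": "role/other", "eventName": "ListBuckets"},
]
flagged = novel_calls(build_baseline(history), recent)
```

The threshold is where the tuning cost shows up. Set `min_new` to 1 and every new deployment alerts; set it too high and slow enumeration slips under it.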
Threat intelligence enrichment scores activity against external indicators: known malicious IPs, Tor exit nodes, command-and-control domains, hashes tied to active campaigns. Intel turns a bland "API call from 203.0.113.10" into "API call from an IP on a current botnet list," which is the difference between an event and a finding. Its limit is freshness; stale intel is worse than none because it breeds false confidence.
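Enrichment with a freshness check might look like the following sketch. The indicator format (`cidr`, `label`, `last_seen`) and the seven-day cutoff are assumptions made for the example, not the format of any particular feed.

```python
import ipaddress
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=7)  # assumed freshness cutoff


def enrich(source_ip, indicators, now):
    """Return labels of fresh indicators whose network contains source_ip."""
    addr = ipaddress.ip_address(source_ip)
    labels = []
    for indicator in indicators:
        if now - indicator["last_seen"] > MAX_AGE:
            continue  # stale intel is skipped rather than trusted
        if addr in ipaddress.ip_network(indicator["cidr"]):
            labels.append(indicator["label"])
    return labels


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
# Hypothetical indicators using documentation-reserved address ranges.
indicators = [
    {"cidr": "203.0.113.0/24", "label": "botnet",
     "last_seen": datetime(2024, 1, 9, tzinfo=timezone.utc)},
    {"cidr": "198.51.100.0/24", "label": "old-c2",
     "last_seen": datetime(2023, 6, 1, tzinfo=timezone.utc)},
]
fresh_match = enrich("203.0.113.10", indicators, now)
stale_match = enrich("198.51.100.7", indicators, now)
```

Dropping stale entries at match time, rather than trusting the feed to prune itself, is one way to keep old indicators from inflating confidence.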
Posture and configuration drift detects the change that creates exposure rather than the exploit that uses it: a security group opened to 0.0.0.0/0, encryption disabled on a database, a public snapshot, a logging trail switched off. This is detection of the precondition for attack, and it overlaps with cloud security posture management, but in a detection context the drift event itself, captured in the audit log, is the alert.
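As a sketch of drift-as-alert, the function below scans audit events for two of the changes named above: a trail being switched off and a security group opened to the world. The nested `ipPermissions` / `items` layout reflects how CloudTrail commonly records EC2 security group requests, but treat the exact shape as an assumption and verify it against your own events.

```python
LOGGING_OFF = {"StopLogging", "DeleteTrail"}  # CloudTrail API actions
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def open_ingress_cidrs(event):
    """Collect world-open CIDRs from an AuthorizeSecurityGroupIngress event."""
    params = event.get("requestParameters") or {}
    found = []
    for perm in (params.get("ipPermissions") or {}).get("items", []):
        for key in ("ipRanges", "ipv6Ranges"):
            for rng in (perm.get(key) or {}).get("items", []):
                cidr = rng.get("cidrIp") or rng.get("cidrIpv6")
                if cidr in OPEN_CIDRS:
                    found.append(cidr)
    return found


def drift_alerts(events):
    """Return (eventName, description) pairs for exposure-creating changes."""
    alerts = []
    for event in events:
        name = event.get("eventName")
        if name in LOGGING_OFF:
            alerts.append((name, "audit logging disabled"))
        elif name == "AuthorizeSecurityGroupIngress" and open_ingress_cidrs(event):
            alerts.append((name, "security group opened to the internet"))
    return alerts


def ingress_event(cidr):
    # Helper that builds a trimmed, hypothetical ingress event.
    return {"eventName": "AuthorizeSecurityGroupIngress",
            "requestParameters": {"ipPermissions": {"items": [
                {"ipRanges": {"items": [{"cidrIp": cidr}]}}]}}}


events = [{"eventName": "StopLogging"},
          ingress_event("0.0.0.0/0"),
          ingress_event("10.0.0.0/8")]
alerts = drift_alerts(events)
```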
Common cloud threats detection catches
These techniques and sources exist to catch a recognizable set of cloud attacks. Five recur often enough that detection programs are built around them.
Credential abuse. Stolen or leaked access keys used to authenticate as a legitimate principal. The signal is in identity and control-plane logs: use from a new IP or region, a sudden change in which APIs a key calls, authentication without the usual MFA. Because the calls are authenticated and authorized, only behavior reveals them. This is the most common cloud attack and the hardest to catch with rules alone.
Privilege escalation. An identity granting itself or another principal more access: attaching an admin policy, creating access keys for another user, assuming a more powerful role, editing a trust policy. Each step is a discrete API call in the audit log, and the escalation chain is a strong, writable detection pattern.
Crypto mining. Compromised compute, often spun up through stolen credentials, running mining workloads. Detection comes from network signals to mining pools, anomalous spikes in compute spend, and provider findings; GuardDuty, for example, raises crypto-mining findings directly from its foundational sources.
Data exfiltration. Sensitive data copied out of the environment: mass S3 object reads, a database snapshot shared to an external account, large egress volumes in flow logs. The audit log shows the access; the flow log shows the movement; together they show the theft.
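The flow-log side of that picture can be sketched by summing accepted bytes from internal sources to external destinations. The parser assumes the default (version 2) VPC Flow Logs record layout; the VPC CIDR and the byte threshold are placeholder assumptions.

```python
import ipaddress
from collections import Counter

INTERNAL = ipaddress.ip_network("10.0.0.0/16")  # assumed VPC CIDR
THRESHOLD_BYTES = 1_000_000_000                 # assumed alert threshold


def egress_by_pair(lines):
    """Sum accepted bytes per (source, destination) leaving the VPC."""
    totals = Counter()
    for line in lines:
        fields = line.split()
        # Default format: version account-id interface-id srcaddr dstaddr
        # srcport dstport protocol packets bytes start end action log-status
        if len(fields) < 14 or fields[12] != "ACCEPT":
            continue  # also skips NODATA records, whose action is "-"
        src, dst, nbytes = fields[3], fields[4], fields[9]
        try:
            src_ip = ipaddress.ip_address(src)
            dst_ip = ipaddress.ip_address(dst)
        except ValueError:
            continue  # malformed or placeholder address
        if src_ip in INTERNAL and dst_ip not in INTERNAL:
            totals[(src, dst)] += int(nbytes)
    return totals


# Hypothetical records for illustration.
lines = [
    "2 123456789012 eni-0abc 10.0.1.5 203.0.113.10 44321 443 6 1000 "
    "1500000000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc 10.0.1.5 10.0.2.9 44322 5432 6 10 "
    "4000 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0abc - - - - - - - 1700000000 1700000060 - NODATA",
]
flagged = {pair: total for pair, total in egress_by_pair(lines).items()
           if total >= THRESHOLD_BYTES}
```

On its own this only shows that bytes left. Joining the flagged source back to the audit-log record of who read what is what turns volume into evidence of theft.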
Public-resource exposure. A storage bucket, database, or snapshot made publicly reachable, by mistake or by an attacker establishing access. The configuration-change API call is the detection point, caught the moment the resource is opened rather than after someone finds it.
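For bucket policies, the check at the moment of change can be as simple as looking for an unconditioned Allow to a wildcard principal in the policy document submitted with the call. This is a deliberately narrow sketch: it covers only `"*"` and `{"AWS": "*"}` principals and treats any `Condition` as restrictive, both of which are simplifying assumptions.

```python
import json


def _is_wildcard(principal):
    """True if the Principal element grants access to everyone."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        aws = principal.get("AWS")
        values = aws if isinstance(aws, list) else [aws]
        return "*" in values
    return False


def public_statements(policy_json):
    """Return Allow statements open to any principal with no Condition."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be unwrapped
        statements = [statements]
    return [
        s for s in statements
        if s.get("Effect") == "Allow"
        and _is_wildcard(s.get("Principal"))
        and "Condition" not in s
    ]


# Hypothetical policy; the bucket name and account ID are placeholders.
policy_json = json.dumps({
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "Public", "Effect": "Allow", "Principal": "*",
         "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Sid": "Internal", "Effect": "Allow",
         "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
         "Action": "s3:*", "Resource": "arn:aws:s3:::example-bucket/*"},
    ],
})
exposed = public_statements(policy_json)
```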
Where cloud detection gets hard
Cloud detection fails in predictable ways, and the failures are operational more than technical.
Alert volume. A busy account generates millions of API events a day, and naive detection drowns the analyst. The work is tuning: suppressing known-good automation, scoping rules tightly, ranking findings so the credential-abuse alert is not buried under a thousand benign configuration changes. Volume is why behavioral baselining needs investment, not just enablement.
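Two of those tuning steps, suppression and ranking, fit in a few lines. The alert shape, the severity scores, and the allow-listed role are all hypothetical values chosen for the sketch.

```python
# Assumed severity scores per alert kind; higher sorts first.
SEVERITY = {"credential_abuse": 90, "privilege_escalation": 80,
            "config_change": 20}

# Hypothetical allow-list of principals known to be benign automation.
KNOWN_AUTOMATION = {"arn:aws:iam::123456789012:role/ci-deployer"}


def triage(alerts):
    """Drop alerts from known automation, then rank the rest by severity."""
    kept = [a for a in alerts if a["principal"] not in KNOWN_AUTOMATION]
    return sorted(kept, key=lambda a: SEVERITY.get(a["kind"], 0), reverse=True)


alerts = [
    {"kind": "config_change",
     "principal": "arn:aws:iam::123456789012:role/ci-deployer"},
    {"kind": "config_change",
     "principal": "arn:aws:iam::123456789012:user/bob"},
    {"kind": "credential_abuse",
     "principal": "arn:aws:iam::123456789012:user/alice"},
]
queue = triage(alerts)
```

An allow-list like this is itself an attack surface: a compromised automation role is suppressed along with the legitimate one, so suppressions are worth scoping by action and resource as well as by principal.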
Multi-cloud fragmentation. AWS, Azure, and GCP each have their own log formats, their own native detector, their own API vocabulary. A team running all three is correlating CloudTrail, Activity Log, and Cloud Audit Logs in three different shapes, often in three different consoles. Normalizing that into one detection pipeline, frequently a SIEM, is real and continuous engineering.
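A sketch of what normalization means in practice: one small adapter per provider that maps a raw record onto a shared shape. The common schema (`cloud`, `time`, `actor`, `action`) is invented for this example, and each adapter reads only a simplified subset of fields. Azure Activity Log records in particular vary by export path, so the fallbacks below are assumptions to check against your own data.

```python
def from_cloudtrail(event):
    """Map an AWS CloudTrail record onto the shared shape."""
    return {"cloud": "aws",
            "time": event["eventTime"],
            "actor": (event.get("userIdentity") or {}).get("arn"),
            "action": event["eventName"]}


def from_azure_activity(event):
    """Map an Azure Activity Log record onto the shared shape."""
    operation = event["operationName"]
    if isinstance(operation, dict):  # some exports wrap the name in an object
        operation = operation.get("value")
    return {"cloud": "azure",
            "time": event.get("eventTimestamp") or event.get("time"),
            "actor": event.get("caller"),
            "action": operation}


def from_gcp_audit(entry):
    """Map a GCP Cloud Audit Logs entry onto the shared shape."""
    payload = entry.get("protoPayload") or {}
    return {"cloud": "gcp",
            "time": entry["timestamp"],
            "actor": (payload.get("authenticationInfo") or {}).get("principalEmail"),
            "action": payload.get("methodName")}


# Hypothetical, trimmed records for illustration.
normalized = [
    from_cloudtrail({"eventTime": "2024-01-01T12:00:00Z",
                     "eventName": "CreateUser",
                     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/alice"}}),
    from_azure_activity({"eventTimestamp": "2024-01-01T12:00:05Z",
                         "caller": "alice@example.com",
                         "operationName": {"value": "Microsoft.Authorization/roleAssignments/write"}}),
    from_gcp_audit({"timestamp": "2024-01-01T12:00:09Z",
                    "protoPayload": {"methodName": "SetIamPolicy",
                                     "authenticationInfo": {"principalEmail": "alice@example.com"}}}),
]
```

The adapters are the easy part. The continuing work is keeping them current as providers add fields, and deciding what "the same action" means across three different API vocabularies.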
Log gaps. Detection only sees what is logged, and logging is opt-in for the sources that matter most. CloudTrail data events, which include the S3 object-level reads that reveal exfiltration, are off by default and cost money to turn on. Flow logs may not cover every VPC. A region with no trail is a blind spot an attacker can operate in undetected. The first detection engineering task in any cloud account is auditing what is actually being logged, because a missing source is a missing detection no rule can compensate for.
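One piece of that audit, finding regions no trail covers, can be sketched as a pure function. The trail dictionaries are assumed to carry the `HomeRegion` and `IsMultiRegionTrail` fields that CloudTrail's DescribeTrails API returns; the region list is something you would supply. The sketch does not check whether a trail is actually logging, which is a separate status lookup.

```python
def uncovered_regions(trails, regions):
    """Return the regions that no trail in the account covers."""
    if any(t.get("IsMultiRegionTrail") for t in trails):
        return set()  # a multi-region trail covers every region
    covered = {t["HomeRegion"] for t in trails}
    return set(regions) - covered


# Hypothetical account state for illustration.
regions = ["us-east-1", "eu-west-1", "ap-southeast-2"]
single = [{"Name": "main", "HomeRegion": "us-east-1",
           "IsMultiRegionTrail": False}]
multi = [{"Name": "org", "HomeRegion": "us-east-1",
          "IsMultiRegionTrail": True}]

gaps = uncovered_regions(single, regions)
no_gaps = uncovered_regions(multi, regions)
```

The same inventory-versus-coverage comparison applies to flow logs per VPC and to data-event selectors per bucket.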
Ephemerality and scale. The signal has to be captured in real time because the resource that produced it is transient, and the volume of resources is large and constantly changing. Detection that assumes a stable inventory of long-lived hosts does not survive contact with an autoscaling, serverless environment.
These challenges are why detection is only half the discipline. Catching the activity is necessary but not sufficient; the value is realized only when a finding drives a fast, often automated response. That handoff, and the remediation that follows, is the subject of cloud response and the combined cloud detection and response practice. This article stops at the finding.
The bottom line
Cloud detection is finding threats in the cloud from the only evidence the cloud reliably gives you: logs. The control-plane audit log is the spine, identity is the center of gravity, and network flow and runtime telemetry fill in what the API record cannot see. The techniques (signatures, behavioral baselining, threat intel, and posture drift) run together because no single one catches everything, and the threats they exist for (credential abuse, privilege escalation, crypto mining, exfiltration, and public exposure) almost all surface first as authorized-but-anomalous API activity.
The hard parts are operational: alert volume, multi-cloud fragmentation, and the log gaps that turn an un-instrumented region into a blind spot. Audit what you are actually logging before you write a single rule, because a detection you cannot feed is a detection you do not have. And remember where this stops. Detection produces the finding; turning that finding into a contained incident is cloud response, and the two together are cloud detection and response.
Frequently asked questions
What is cloud detection?
Cloud detection is finding threats inside a cloud environment by analyzing the telemetry it produces: audit logs of API calls, identity and sign-in events, network flow logs, and runtime signals from workloads. It evaluates that activity against known-bad patterns and learned baselines, then raises a finding when something looks like an attacker. It is the detection half of cloud detection and response; stopping the threat is cloud response.
How is cloud detection different from on-prem detection?
On-prem detection watches hosts and network traffic at a perimeter. Cloud detection has no single perimeter to watch, so it centers on the provider's control plane (the API layer where resources and permissions change) and on identity, because most cloud attacks run through stolen credentials. It also has to work in real time against ephemeral resources like containers and serverless functions that may not exist long enough to image after the fact.
What data sources does cloud detection use?
The primary sources are control-plane audit logs (AWS CloudTrail, Azure Activity Log, GCP Cloud Audit Logs), identity and sign-in logs, network flow logs (VPC Flow Logs), runtime telemetry from agents or eBPF sensors inside workloads, and cloud-native findings services like Amazon GuardDuty, Microsoft Defender for Cloud, and Google Security Command Center that analyze those logs for you.
What threats does cloud detection catch?
The recurring ones are credential abuse (stolen access keys used as a legitimate identity), privilege escalation (an identity granting itself more access), crypto mining on compromised compute, data exfiltration (mass reads or snapshots copied out), and public-resource exposure (a bucket or database opened to the internet). Most show up first as anomalous but authorized API activity in the audit log.
What data sources does Amazon GuardDuty use?
Per AWS documentation, GuardDuty's foundational data sources are CloudTrail management events, VPC Flow Logs, and Route 53 Resolver DNS query logs, which it consumes automatically without changing your own logging setup. Additional coverage, such as S3 data-event analysis, EKS audit logs, RDS login activity, Lambda network activity, and runtime monitoring, comes from optional protection plans you enable separately.
Why is cloud detection identity-centric?
Because the cloud has no network perimeter, identity becomes the access boundary for the control plane, and stolen credentials are the most common way attackers operate. Detection therefore focuses on who: which principal made a call, whether a role is behaving normally, whether a key is being used from a new location, and whether a user just granted itself new permissions. Identity telemetry is the highest-value cloud detection signal.