What Is Log Retention? A Defender's Guide

A breach is confirmed on June 1. The first sign of compromise traces back to a phishing email opened in February. The responder pivots to the authentication logs to scope what the stolen credential touched, and the query returns nothing before April. The logs that would have shown the initial access, the first lateral move, and the early staging were rotated off disk weeks ago. The investigation now has a four-month attack and two months of evidence. That gap is a retention failure, and it is the most common way an organization that collected the right logs still cannot answer what happened.

Log retention is the policy and mechanism that decides how long log data is kept, on what storage, and when it is deleted. It is the difference between a log archive that can reconstruct a months-old intrusion and one that cannot. This guide covers what log retention is, why the default of short rotation fails, the drivers that set a retention period, how tiered storage makes long retention affordable, and the practices that keep a retention policy both useful and defensible. It is written for the people who depend on old logs: responders scoping an incident, threat hunters building baselines, and analysts answering an auditor.

What is log retention?

Log retention is the practice of storing log data for a defined period before it is archived or deleted, governed by a written policy that states how long each type of log is kept and where. It answers one question for every log source: when an event from N days ago is needed, is it still there?

Retention is distinct from collection. Collection gets the event off its source and into a store, often a security information and event management platform. Retention decides how long that store holds it. An environment can collect every log perfectly and still lose an investigation because the data aged out before anyone went looking. Retention is also distinct from the storage tier the data sits on: a single retention period is usually served across several tiers, hot for recent data and colder, cheaper tiers for older data, so that a long retention window does not mean a large fast-storage bill.

The unit is the log event, the same timestamped record a logging pipeline collects. Retention attaches a clock to that record. Different log types get different clocks: a high-volume debug log might be kept for days, while authentication and audit logs that carry security and compliance weight are kept for a year or more. A retention policy is the set of those clocks, written down, applied consistently, and enforced by the platform.
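The "set of clocks" idea can be sketched in a few lines of Python. This is a minimal illustration, not any platform's actual configuration: the `RETENTION_DAYS` table, its values, the `is_retained` helper, and the example dates (including the year) are all assumptions made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-log-type retention periods, in days (illustrative values only).
RETENTION_DAYS = {
    "debug": 7,
    "authentication": 365,
    "audit": 365,
}


def is_retained(log_type: str, event_time: datetime, now: datetime) -> bool:
    """Return True if an event of this type is still inside its retention window."""
    period = timedelta(days=RETENTION_DAYS[log_type])
    return now - event_time <= period


# Assumed dates: a February event examined on June 1 of the same year.
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
phish = datetime(2025, 2, 10, tzinfo=timezone.utc)

print(is_retained("authentication", phish, now))  # True: 111 days old, 365-day clock
print(is_retained("debug", phish, now))           # False: 111 days old, 7-day clock
```

The same event age gives a different answer depending on which clock the log type carries, which is why the policy has to be stated per log type.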

Why short rotation fails

Every system rotates logs by default. Linux logrotate compresses and deletes old files on a schedule. The Windows Event Log overwrites the oldest entries when its fixed-size buffer fills. Cloud services expire log streams after a default window. Left at these defaults, an environment keeps days or weeks of history, and that window is far shorter than the timeline of a real intrusion.

The core problem is dwell time. Attackers are often present in an environment for weeks or months before detection. Mandiant's M-Trends 2026 report put the global median dwell time at fourteen days for 2025, up from eleven the year before, but that median hides a long tail: espionage intrusions and similar long-term campaigns ran a median of about 122 days, roughly four months, and many breaches found by an external party rather than the victim run longer still. When detection happens on day ninety and the logs only go back thirty days, the first two thirds of the attack are gone. The evidence of initial access, persistence, and early lateral movement, the part that tells you how they got in and what else they touched, is exactly the part that short rotation deletes first.
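The day-ninety arithmetic above can be written out as a small sketch. The function name `evidence_coverage` is hypothetical, and it assumes the attack starts on day 0 and that logs cover exactly the most recent `retention_days` days.

```python
def evidence_coverage(detection_day: int, retention_days: int) -> float:
    """Fraction of the attack timeline (day 0 to detection) still covered by logs."""
    if detection_day <= 0:
        raise ValueError("detection_day must be positive")
    # Logs can never cover more days than the attack has been running.
    covered_days = min(detection_day, retention_days)
    return covered_days / detection_day


covered = evidence_coverage(detection_day=90, retention_days=30)
print(f"covered: {covered:.0%}, lost: {1 - covered:.0%}")  # covered: 33%, lost: 67%
```

With a 365-day window the same call returns 1.0: the whole timeline, including initial access, is still available.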

Short retention also breaks baselining. To know that a host reaching out to a new external IP at 3 a.m. is anomalous, you need months of history showing it never did that before. A hunt for slow, low-volume activity (a beacon that checks in once a day, a credential used once a week) needs a long enough window to see the pattern at all. Thirty days of logs cannot establish a seasonal or monthly baseline.
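A rough sketch of why the lookback window matters for a "never seen before" check. The `seen_before` helper, the record layout of `(timestamp, host, destination IP)`, the host name, and the documentation-range IP address are all assumptions for illustration; a real platform would run this as a query against its own store.

```python
from datetime import datetime, timedelta


def seen_before(history, host, dest_ip, now, lookback_days):
    """True if (host, dest_ip) appears in history within the lookback window."""
    cutoff = now - timedelta(days=lookback_days)
    return any(
        h == host and ip == dest_ip and cutoff <= ts < now
        for ts, h, ip in history
    )


now = datetime(2025, 6, 1, 3, 0)
# One earlier connection to the same destination, 60 days ago.
history = [(now - timedelta(days=60), "web-01", "203.0.113.7")]

print(seen_before(history, "web-01", "203.0.113.7", now, lookback_days=30))   # False
print(seen_before(history, "web-01", "203.0.113.7", now, lookback_days=180))  # True
```

With only thirty days of history the destination looks brand new; with six months it turns out to be a known, if infrequent, contact. The short window cannot tell the two cases apart.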

Then there is compliance and legal exposure. Many regulations mandate a minimum retention period, and an organization that cannot produce the required history fails the audit regardless of how good its detection is. During litigation or a regulator's investigation, an organization may be under a legal hold that forbids deleting relevant logs, and an automated rotation job that quietly deletes them anyway is spoliation of evidence. Short, unmanaged rotation is a liability, not just a blind spot.

What drives a retention period

There is no single correct retention number. The period for each log type is set by weighing four drivers: compliance mandates, investigation needs, operational and cost limits, and legal hold.

Compliance and regulatory mandates

Many frameworks set a floor. PCI DSS requires that audit log history be retained for at least one year, with at least the most recent three months immediately available for analysis. HIPAA requires certain documentation be retained for six years. SOX, GDPR, and sector regulations carry their own requirements. The rule is to retain for the longest applicable mandate, and to check the actual regulation text rather than assume, because the figures differ by framework and change over time.

Investigation and dwell time

Retention should outlive realistic attacker dwell time, so that when an intrusion is found late, incident response still has logs that cover its beginning. Because external-notified breaches and slow espionage campaigns can run many months, security-relevant logs (authentication, endpoint, network, and cloud control plane) are commonly retained for a year or more, well beyond the compliance floor, specifically so a late-discovered incident can still be fully scoped.
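Taking the longest of the first two drivers can be sketched as a simple maximum. The `DRIVERS` table and its numbers are hypothetical placeholders, not figures from any regulation beyond the one-year PCI DSS floor mentioned above; real values should come from the regulation text and from the organization's own investigation needs.

```python
# Illustrative driver values in days; verify real mandates against the regulation text.
DRIVERS = {
    "authentication": {"compliance_floor": 365, "investigation_need": 400},
    "debug": {"compliance_floor": 0, "investigation_need": 14},
}


def retention_period(drivers: dict) -> int:
    """Retention period in days: the longest of the applicable drivers."""
    return max(drivers.values())


for log_type, drivers in DRIVERS.items():
    print(log_type, retention_period(drivers))
```

In this made-up table the investigation need, not the compliance floor, sets the period for authentication logs, which mirrors the point that security-relevant logs often run past the bare mandate.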

Operational and cost limits

Logs are large, and high-fidelity sources are the largest. Retaining everything at full fidelity on fast storage is expensive, so retention is rarely uniform: noisy, low-value sources get short windows, and high-value security sources get long ones. The lever that makes long retention affordable is the storage tier, covered next.

Legal hold

A legal hold overrides the normal schedule. When litigation or an investigation is reasonably anticipated, the relevant logs must be preserved beyond their usual period and exempted from automated deletion until the hold is lifted. A retention system has to support pausing deletion for specific data, or it will destroy evidence on schedule.
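A minimal sketch of a deletion check that honors a legal hold. The `LogBatch` type, the `eligible_for_deletion` function, and the idea of tracking holds as a set of batch IDs are assumptions for illustration; real platforms expose holds through their own mechanisms.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class LogBatch:
    batch_id: str
    log_type: str
    written_on: date


def eligible_for_deletion(batch, today, retention_days, held_batch_ids):
    """A batch may be deleted only if it is past retention AND not under hold."""
    if batch.batch_id in held_batch_ids:
        return False  # the hold overrides the normal schedule
    return today - batch.written_on > timedelta(days=retention_days)


today = date(2025, 6, 1)
old = LogBatch("auth-2024-01", "authentication", date(2024, 1, 15))

print(eligible_for_deletion(old, today, 365, held_batch_ids=set()))             # True
print(eligible_for_deletion(old, today, 365, held_batch_ids={"auth-2024-01"}))  # False
```

Checking the hold first means the deletion job fails safe: a batch under hold is kept no matter how far past its retention period it is.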

How tiered retention controls cost

[Figure: Log retention, the tiered lifecycle. One event ages through hot (days to weeks), warm (weeks to months), and cold (months to years) storage, then is deleted at the end of retention unless a legal hold pauses deletion.]

Long retention would be unaffordable if every log sat on fast storage for its whole life. Tiered storage solves that by matching the storage cost to how the data is actually used over time. The same event moves through tiers as it ages, getting cheaper and slower to query, until it is deleted.

| Tier | Typical age | Query speed | Cost | Used for |
| --- | --- | --- | --- | --- |
| Hot | Recent (days to weeks) | Instant, fully indexed | Highest | Live detection, alert triage, active investigation |
| Warm | Older (weeks to months) | Slower, still searchable | Medium | Recent-history hunting, incident scoping |
| Cold / archive | Old (months to years) | Slow, often must be rehydrated | Lowest | Compliance, late-discovered breach forensics, legal hold |

Recent data, the data queried constantly for live detection and triage, stays hot and fully indexed for instant search. As it ages past the point of daily use, it moves to warm storage that is cheaper and a little slower but still searchable. Old data that is kept for compliance or the rare late-breaking investigation moves to cold or archive storage, the cheapest tier, where retrieval may take time but the cost of holding years of logs stays manageable. Retention policy and tiering work together: the policy sets the total lifespan, and the tiers set what that lifespan costs.
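The interaction between lifespan and tier cost can be sketched as follows. The tier boundaries in `TIER_LIMITS` and the relative cost units in `COST_PER_DAY` are invented for illustration; actual boundaries and prices depend entirely on the platform and storage in use.

```python
# Hypothetical tier boundaries: (tier name, maximum age in days).
TIER_LIMITS = [("hot", 30), ("warm", 90), ("cold", 365)]

# Hypothetical relative cost of holding one unit of data for one day in each tier.
COST_PER_DAY = {"hot": 20, "warm": 8, "cold": 1}


def tier_for_age(age_days: int) -> str:
    """Map an event's age to its storage tier, or 'delete' past end of retention."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    for tier, max_age in TIER_LIMITS:
        if age_days <= max_age:
            return tier
    return "delete"


def relative_cost(total_days: int, tiered: bool = True) -> int:
    """Total relative cost of holding one unit of data for total_days."""
    cost = 0
    for age in range(1, total_days + 1):
        tier = tier_for_age(age) if tiered else "hot"
        cost += COST_PER_DAY.get(tier, 0)  # deleted data costs nothing
    return cost


print(relative_cost(365))                # 1355
print(relative_cost(365, tiered=False))  # 7300
```

Under these assumed numbers, a year of tiered retention costs less than a fifth of keeping the same data hot for the whole year. The ratio is an artifact of the made-up prices, but the shape of the saving is the point.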

Log retention vs. log archiving

Retention and archiving are related but not the same, and the distinction matters when writing a policy.

| Dimension | Log retention | Log archiving |
| --- | --- | --- |
| Scope | The whole lifespan policy, from ingest to deletion | The long-term, cold-storage stage of that lifespan |
| Concern | How long to keep each log type, and when to delete | Storing aged data cheaply and durably for the long term |
| Data state | Spans hot, warm, and cold tiers | Compressed, immutable, slow to query |
| Primary driver | Investigation need plus compliance mandate | Cost and durability of long-term storage |
| Typical use | Live detection through late forensics | Compliance evidence, late-discovered breach, legal hold |

Retention is the umbrella policy that governs a log's entire life. Archiving is one stage inside it: the cold, durable, low-cost storage where data lives out the long tail of its retention period. You set a retention policy; archiving is how you afford the end of it.

Best practices for log retention

A retention policy is only as good as its enforcement and its fit to real needs. A few decisions separate a defensible policy from an expensive or useless one.

Write the policy down, per log type. Retention should be a documented policy that states the period for each category of log, not an accident of default rotation settings. Authentication, audit, endpoint, network, and cloud control-plane logs each get an explicit period. A written policy is also what you show an auditor.

Set the period to the longest real driver. For each log type, take the maximum of the compliance mandate, realistic dwell time, and any contractual requirement. Security-relevant logs generally need a year or more so a late-discovered intrusion can still be scoped, which is often longer than the bare compliance floor.

Tier storage to make long retention affordable. Keep recent data hot for fast detection, age older data to warm, and push compliance and long-tail data to cold or archive storage. Tiering is what lets you retain for a year or more without paying fast-storage prices for the whole window.

Protect retained logs from tampering and early deletion. Retained logs are evidence, so the archive is a target and a liability. Store long-term logs as immutable or write-once where possible, restrict who can delete, and log deletions themselves. An attacker or an insider who can quietly shorten retention or wipe the archive defeats the whole point.

Support legal hold and exceptions. The system must be able to pause deletion for specific data under a legal hold and keep it until the hold lifts. Automated deletion that cannot be selectively suspended will destroy evidence that you are legally required to preserve.

Review retention against changing rules and cost. Compliance mandates change and log volumes grow. Revisit the policy periodically so it still meets current regulations and still fits the budget, and so retired log sources are not silently kept forever.
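A periodic review can start with a simple consistency check between the written policy and the log sources that actually exist. The sketch below is hypothetical: `review_policy`, the policy dictionary, and the source names are illustrative, and a real review would also compare each period against current mandates and budget.

```python
def review_policy(policy: dict, active_sources: set) -> dict:
    """Flag active sources with no written period and policy entries for retired sources."""
    written = set(policy)
    return {
        "missing_period": sorted(active_sources - written),
        "retired_but_listed": sorted(written - active_sources),
    }


# Hypothetical written policy (days per log type) and current source inventory.
policy = {"authentication": 365, "audit": 365, "legacy_vpn": 365}
active_sources = {"authentication", "audit", "cloud_control_plane"}

print(review_policy(policy, active_sources))
# {'missing_period': ['cloud_control_plane'], 'retired_but_listed': ['legacy_vpn']}
```

The first list is data running on default rotation with no stated period; the second is a candidate for data that is being kept with no one deciding it should be.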

Frequently Asked Questions

What is log retention?

Log retention is the policy and mechanism that determines how long log data is kept before it is archived or deleted, and on what storage it lives during that time. It is governed by a written policy that sets a retention period for each type of log, driven by investigation needs and compliance requirements, so that when an event from the past is needed, it is still available.

How long should logs be retained?

There is no single number. Set each log type's period to the longest applicable driver: the relevant compliance mandate, realistic attacker dwell time, and any contractual or legal requirement. Many regulations set a floor; for example, PCI DSS requires at least one year of audit log history. Security-relevant logs are often kept a year or more so a late-discovered breach can still be fully scoped.

What is the difference between log retention and log archiving?

Log retention is the overall policy governing a log's entire lifespan, from ingest through deletion, across hot, warm, and cold storage. Log archiving is one stage inside that lifespan: moving aged data to cheap, durable, long-term storage. Retention decides how long to keep data; archiving is how that long-term storage is done affordably.

Why does log retention matter for incident response?

Attackers often dwell in an environment for weeks or months before detection, so an intrusion is frequently discovered long after the initial compromise. If logs are only kept for a short window, the evidence of how the attacker got in and what they touched early on is already deleted. Adequate retention is what lets a responder reconstruct the full attack timeline rather than only its final days.

How does tiered storage reduce retention cost?

Tiered storage matches storage cost to how data is used as it ages. Recent data stays on fast, fully indexed hot storage for live detection. Older data moves to cheaper warm storage that is still searchable, and the oldest data moves to low-cost cold or archive storage for compliance and rare late investigations. This lets an organization retain logs for a year or more without paying fast-storage prices for the entire period.

What compliance frameworks require log retention?

Many do, with different periods. PCI DSS requires at least one year of audit log retention with three months readily available. HIPAA requires six years of certain records. SOX, GDPR, and various sector and national regulations impose their own requirements. Because the figures differ and change, set retention to the longest applicable mandate and verify the requirement against the current regulation text rather than assuming a number.

Can retained logs be deleted during an investigation?

No. When litigation or an investigation is reasonably anticipated, a legal hold requires the relevant logs to be preserved beyond their normal retention period and exempted from automated deletion until the hold is lifted. Deleting them on the usual schedule anyway is spoliation of evidence. A retention system must be able to selectively suspend deletion for data under hold.

The bottom line

Log retention is the policy and mechanism that decides how long log data is kept, on what storage, and when it is deleted. It exists because the default of short rotation deletes the evidence of an attack long before the attack is discovered, and because compliance and legal obligations demand a defined, defensible history. The period for each log type is set by the longest real driver: compliance mandate, attacker dwell time, and legal hold, balanced against cost through tiered storage.

For a defender, retention is what makes old logs answer new questions. A late-discovered breach can only be scoped if the logs from its beginning still exist, a baseline can only be built from months of history, and an auditor can only be satisfied by a documented, enforced policy. Write the policy down per log type, set each period to the longest real driver, tier storage so long retention stays affordable, protect the archive from tampering and early deletion, and support legal hold. Do that, and the four-month attack confirmed in June is still four months of evidence, not two.
