
What Is Open-Source Intelligence (OSINT)?

14 min read · Updated June 2026 · Tags: Threat Hunting, Threat Intelligence, Blue Team, SOC, Cyber Threat Intelligence

Before an attacker sends a single packet, they can already know your org chart, who reports to whom, which employee just posted about deploying a new tool, the email format from a leaked address, and the name of a security product mentioned in a job posting. None of it required hacking. It came from LinkedIn, a conference talk on YouTube, a public code repository, a job board, and a credential dump on a paste site. By the time the phishing email arrives, it is tailored: right name, right tone, right pretext. That tailoring is the product of open-source intelligence.

OSINT (open-source intelligence) is intelligence collected from publicly and legally available sources, then analyzed to answer a question. "Open source" here means open information (anything anyone can access without breaking in), not open-source software. It is the same discipline whether an attacker uses it to plan an intrusion or a defender uses it to find what an attacker would see. The raw material is public; the intelligence is in the collection and analysis.

This guide covers what OSINT is, the cycle that turns public data into intelligence, where the data comes from, how both attackers and defenders use it, the common tools and techniques, and where the legal and ethical line sits. It is written for blue teamers who need to understand both sides of it: what an adversary can learn about you, and how to use the same sources to defend.

What is open-source intelligence?

Open-source intelligence is the practice of gathering information from publicly available sources and analyzing it to produce something useful: a profile, a lead, an answer. The defining trait is the source: everything is obtained legally from places open to the public. No exploitation, no unauthorized access, no private data taken by force. The skill is not access; it is knowing where to look and how to turn scattered public facts into a coherent picture.

The word "intelligence" matters as much as "open source." A list of employee names scraped from a website is data, not intelligence. It becomes intelligence when it is collected with a purpose, corroborated, and analyzed into an answer to a specific question: who in this company has privileged access, what technologies do they run, where is their attack surface exposed. OSINT is the process that closes that gap between raw public data and a decision someone can act on.

OSINT is a foundation of cyber threat intelligence, but it is broader than security. The same discipline is used in investigations, journalism, due diligence, and law enforcement. In a cybersecurity context, it cuts both ways: it is the first phase of most attacks and a core tool of the defenders trying to stop them.

The OSINT cycle

Good OSINT is not random searching; it follows the intelligence cycle, the same loop intelligence work has always used, adapted to public data.

Direction. Start with a question. "Find this person's exposed accounts," "map this company's internet-facing assets," "attribute this phishing kit." Without a defined requirement, collection becomes aimless scraping that produces volume, not answers.
Collection. Gather data against the requirement from the relevant sources: search engines, social media, public records, technical databases, the dark web. This is the phase most people picture when they think of OSINT, but it is only one step.
Processing. Turn raw collection into usable form: deduplicate, translate, extract entities, organize. A thousand scraped posts are useless until they are structured.
Analysis. Connect and interpret the processed data into intelligence. This is where corroboration happens, where a name, an email, and a breach record become "this account is reused and exposed," and where the answer to the original question actually forms.
Dissemination. Deliver the finding to whoever needs it in a form they can use: a report, an indicator feed, a briefing.

The cycle loops: the analysis usually raises new questions that feed the next round of direction. Treating OSINT as a one-shot search is the most common way it produces noise instead of intelligence.
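The processing step is the easiest part of the cycle to show in code. The sketch below is a minimal illustration, not a production parser: it deduplicates raw text items and extracts email addresses and their domains. The function name `process` and the regular expression are assumptions made for this example, and the pattern is deliberately loose.

```python
import re

# Deliberately simple pattern for illustration only; real extraction needs
# stricter validation and will otherwise miss or over-match edge cases.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def process(raw_items):
    """Deduplicate raw text items and extract email and domain entities."""
    seen = set()
    emails = set()
    for item in raw_items:
        text = " ".join(item.split())  # collapse runs of whitespace
        key = text.lower()             # case-insensitive duplicate check
        if not text or key in seen:
            continue
        seen.add(key)
        for match in EMAIL_RE.findall(text):
            emails.add(match.lower())
    domains = {email.split("@", 1)[1] for email in emails}
    return {
        "unique_items": len(seen),
        "emails": sorted(emails),
        "domains": sorted(domains),
    }
```

The output is structured enough to feed the analysis step: a count of distinct items plus sorted entity lists that can be compared across collection runs.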

Where OSINT data comes from

Public sources are vast. They group into a handful of categories worth knowing.

The open web and search engines. Websites, news, blogs, and the advanced search operators ("Google dorking") that surface exposed files, login pages, and misconfigured services that were never meant to be indexed.
Social media. LinkedIn for org structure and roles, and other platforms for personal details, locations, and connections, the raw material for pretexting and phishing.
Public records and registries. Domain WHOIS data, corporate filings, court records, and government databases that tie people, companies, and infrastructure together.
Technical databases. Services that index internet-facing hosts, certificates, DNS records, and open ports, letting anyone map an organization's external attack surface without touching it directly.
Code and document repositories. Public repositories where developers leak API keys, credentials, internal hostnames, and architecture details in commits and config files.
The deep and dark web. Paste sites, breach dumps, and underground forums where leaked credentials and stolen data surface, often the first place an exposed password appears.

The pattern across all of them: each holds a fragment, and OSINT is the work of assembling the fragments into a picture none of the sources shows on its own.
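To make the code-repository category concrete, here is a rough sketch of how a defender might scan text from their own commits or config files for secret-shaped strings. The pattern set is a small illustrative assumption, not a complete ruleset: the `AKIA` prefix reflects the commonly documented shape of AWS access key IDs, and the generic assignment rule will produce false positives. `PATTERNS` and `scan_text` are hypothetical names.

```python
import re

# Illustrative patterns only (an assumption for this sketch); dedicated
# secret scanners ship far larger and better-tuned rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "generic_assignment": re.compile(
        r"(?:api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{16,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text):
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

Reporting the line number and rule name, rather than the matched value, keeps the secret itself out of logs and tickets.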

How attackers use OSINT

[Figure: OSINT as reconnaissance (TA0043). Five public sources feed one tailored phishing email: LinkedIn (org chart and who reports to whom), a YouTube conference talk (tools in use), a public repo (internal hostnames and config details), a job board (a posting naming a security product), and a paste-site dump (the email format from a leaked address). Caption: Five public fragments assembled into one targeted lure. No hacking required.]

For an attacker, OSINT is reconnaissance, the first move in almost every targeted intrusion. MITRE ATT&CK places it under the Reconnaissance tactic (TA0043), which covers both active and passive information gathering to support targeting. OSINT is the passive side: searching open websites, social media, code repositories, and technical databases before any target system is touched.

The payoff is a tailored attack. Public data tells an attacker who to target (the finance staff who can move money, the admins with privileged access), how to reach them (email formats, phone numbers, social accounts), and what pretext will work (a current project, a vendor relationship, an event the target attended). It reveals the technology in use, from job postings naming specific security products to metadata and headers exposing software versions, which shapes which exploits to bring. And it surfaces already-leaked credentials that may grant access with no exploitation at all.
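How little it takes to recover an email format is worth seeing from the defender's side. The sketch below assumes you already know one employee's name and one leaked address, and checks which common naming pattern links them. The pattern labels and the function name `infer_email_format` are assumptions for illustration; real organizations use many more variants.

```python
def infer_email_format(first, last, leaked_address):
    """Guess which naming pattern produced a leaked address's local part.

    Returns a pattern label, or None when no candidate pattern matches.
    """
    first, last = first.lower(), last.lower()
    local = leaked_address.split("@", 1)[0].lower()
    # A handful of common patterns (hypothetical labels, not an exhaustive list).
    candidates = {
        "first.last": f"{first}.{last}",
        "flast": f"{first[0]}{last}",
        "firstl": f"{first}{last[0]}",
        "first": first,
        "last.first": f"{last}.{first}",
    }
    for label, value in candidates.items():
        if value == local:
            return label
    return None
```

Run against your own organization's exposed addresses, a consistent result means a single leak reveals the format for every employee whose name is public.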

The critical point for defenders is that this entire phase is invisible. Reconnaissance from public sources generates no alerts on your systems because it never touches them. By the time the attack is visible, the homework is already done. You cannot detect the collection; you can only reduce what it finds.

How defenders use OSINT

The same discipline is one of the defender's most useful tools, and using it well means seeing yourself the way an attacker does.

Attack surface discovery. Use the same technical databases an attacker uses to find your own exposed assets (forgotten subdomains, open ports, expired certificates, and shadow IT) before someone else does. You cannot defend an asset you do not know is public.

Leaked credential and data monitoring. Watch breach dumps and paste sites for your organization's credentials and data so you can force resets and respond before the exposure is used. This is one of the highest-value defensive uses of OSINT.

Threat intelligence and attribution. OSINT feeds threat hunting and investigation: researching a malicious domain, enriching an indicator, or profiling a threat actor from their public infrastructure and leaks.

Brand and executive exposure. Monitor for impersonation domains, spoofed social accounts, and the personal-information exposure of high-value employees that enables targeted attacks.

A concrete defensive workflow ties these together. A team profiles its own external footprint with an internet-scanning database and finds a forgotten staging server exposing a login page. They cross-reference the company's email format against recent breach dumps and discover three reused passwords for staff with remote access. They check a code-sharing site and find an old commit leaking an internal hostname and an API token. None of that required touching production, and each finding is a fix: take the server down, force the resets, rotate the token. That is the homework an attacker would have done, run first by the defender.
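The breach-dump cross-reference in that workflow can be sketched as follows. It assumes the dump is plain text with one `email:password` pair per line, which is a common but not universal layout, and the function name `exposed_accounts` is hypothetical. The sketch counts appearances per address and never stores the passwords, in keeping with the privacy handling the article recommends.

```python
def exposed_accounts(dump_lines, org_domain):
    """Count how often each address at org_domain appears in a dump.

    Assumes 'email:password' lines. The password field is checked only to
    confirm the line has the expected shape; it is never kept.
    """
    suffix = "@" + org_domain.lower()
    counts = {}
    for line in dump_lines:
        email, sep, password = line.strip().partition(":")
        if not sep or not password:
            continue  # skip lines that are not an email:password pair
        email = email.lower()
        if email.endswith(suffix):
            counts[email] = counts.get(email, 0) + 1
    return counts
```

An address that appears more than once, or across several dumps, is a reasonable first candidate for a forced reset.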

The unifying idea is the attacker's-eye view. Running the same collection against your own organization tells you what the reconnaissance phase of an attack would turn up, which is exactly the list of things worth fixing.
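One way to turn the attacker's-eye view into a fix list is to compare what public sources show against what your inventory says should exist. The sketch below is a simple set difference under stated assumptions: `discovered` is a list of hostnames gathered from any public source, `inventory` is your own asset list, and both function names are hypothetical.

```python
def normalize_host(name):
    """Lowercase a hostname and strip whitespace, a trailing dot, and a leading '*.'."""
    host = name.strip().lower().rstrip(".")
    if host.startswith("*."):
        host = host[2:]
    return host


def unknown_assets(discovered, inventory, org_domain):
    """Return hosts seen publicly under org_domain that the inventory does not list."""
    known = {normalize_host(h) for h in inventory}
    org = normalize_host(org_domain)
    found = set()
    for name in discovered:
        host = normalize_host(name)
        # Match the domain itself or a true subdomain, not a lookalike suffix.
        in_scope = host == org or host.endswith("." + org)
        if in_scope and host not in known:
            found.add(host)
    return sorted(found)
```

Anything this returns is either shadow IT or a gap in the inventory, and both are worth resolving.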

The two sides use the same sources toward opposite ends:

Source	Attacker use	Defender use
Social media / LinkedIn	Build org charts and phishing pretexts	Spot executive and staff over-exposure
Internet-scanning databases	Map external attack surface to target	Find own exposed and forgotten assets
Breach dumps / paste sites	Reuse leaked credentials for access	Force resets before the leak is used
Code repositories	Harvest leaked keys and hostnames	Catch secrets committed by developers
WHOIS / DNS records	Map infrastructure and ownership	Enrich indicators and attribute threats

OSINT tools and techniques

The technique behind all of it is pivoting: start with one known fact (an email, a domain, a username), use it to find connected facts, then pivot again, building outward until the picture is complete. The tools exist to automate and scale that pivoting.
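Pivoting maps naturally onto a breadth-first search over entities. The following is a minimal sketch, not how any particular tool implements it: entities are `(type, value)` tuples, and `lookups` is a hypothetical mapping from an entity type to a function that returns related entities. In practice each lookup would wrap a real data source.

```python
from collections import deque


def pivot(seed, lookups, max_depth=2):
    """Expand outward from one known fact, breadth-first.

    `seed` is a (type, value) tuple. `lookups` maps an entity type to a
    function taking a value and returning related (type, value) tuples.
    Returns every discovered entity mapped to the path that led to it,
    so an analyst can review each link before trusting it.
    """
    paths = {seed: [seed]}
    queue = deque([(seed, 0)])
    while queue:
        entity, depth = queue.popleft()
        if depth >= max_depth:
            continue  # depth limit keeps the expansion bounded
        lookup = lookups.get(entity[0])
        if lookup is None:
            continue  # no data source registered for this entity type
        for related in lookup(entity[1]):
            if related not in paths:
                paths[related] = paths[entity] + [related]
                queue.append((related, depth + 1))
    return paths
```

Keeping the full path for each finding is a deliberate choice: it records why two facts were connected, which is what the corroboration step needs to check.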

A practitioner's toolkit usually spans a few categories: search engines with advanced operators for the open web; data-aggregation and link-analysis tools (such as Maltego) that map relationships between entities; internet-scanning databases (such as Shodan) for external infrastructure; and harvesters (such as theHarvester) that collect emails, names, and subdomains for a target. The OSINT Framework is a widely used directory that organizes these resources by what you are trying to find. The specific tool matters less than the method: define the question, collect against it, and pivot from each finding to the next.

A word of caution on tooling: automated collection scales, but it also produces noise and false connections. The analysis step, in which human judgment confirms that two data points actually refer to the same entity, is what keeps automated OSINT honest.

The legal and ethical line

OSINT uses public sources, but "public" is not the same as "anything goes." The legal line is access: collecting information that is openly available is legitimate, while logging into accounts that are not yours, using leaked credentials to access systems, or social-engineering private data crosses from OSINT into intrusion. Looking at a public profile is OSINT; using a password from a breach dump to log in is a crime.

For defensive work, the practical rules are straightforward. Stay within passive collection of genuinely public data when assessing your own exposure, get authorization in writing before OSINT becomes part of a sanctioned engagement against any target, and handle any sensitive data you find (especially leaked personal information) under the same privacy obligations as any other sensitive data. The discipline is professional, not a license to snoop.

Getting started with OSINT

If you want to build the skill, work it from both sides: collect, and then analyze what you collected into an answer.

Learn the sources and operators. Get fluent with search operators, WHOIS and DNS lookups, and the major technical databases. Knowing where a given fact lives is half the skill.
Practice pivoting. Take one data point and see how far you can extend it through public sources, always corroborating before you connect.
Run OSINT against yourself. Profile your own digital footprint, or your organization's external attack surface, the way an attacker would. It is legal, safe, and immediately useful.
Turn collection into intelligence. Practice the analysis step, structuring findings and answering a defined question, on real investigation scenarios.
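For the search-operator practice mentioned above, a small helper can generate a starting set of queries scoped to a domain you are authorized to assess. The queries use widely supported operators (`site:`, `filetype:`, `inurl:`, `intitle:`), but operator support varies between search engines, and the categories and the function name `build_dorks` are assumptions chosen for this sketch.

```python
def build_dorks(domain):
    """Return labeled search queries scoped to a single domain.

    The queries are plain strings to paste into a search engine; nothing
    here sends any request.
    """
    scope = f"site:{domain}"
    return {
        "exposed_documents": f"{scope} (filetype:pdf OR filetype:xlsx OR filetype:docx)",
        "login_pages": f"{scope} (inurl:login OR inurl:admin)",
        "directory_listings": f'{scope} intitle:"index of"',
        "config_files": f"{scope} (filetype:env OR filetype:log OR filetype:bak)",
    }
```

Start with your own domain: each result that should not be indexed is a finding you can fix before anyone else searches for it.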

The bottom line

OSINT is the discipline of turning publicly available information into intelligence, and in cybersecurity it runs in both directions. Attackers use it as silent reconnaissance to tailor an intrusion before they ever touch a system, which means the most important reconnaissance phase of an attack produces no alerts at all. Defenders use the same sources and the same cycle (direction, collection, processing, analysis, dissemination) to see their own exposure first: the leaked credential, the forgotten subdomain, the over-shared employee. The data is public to everyone; the advantage goes to whoever collects and analyzes it better. For a defender, that means looking at your organization the way an attacker would, and fixing what you find before they use it.

Frequently asked questions

What does "open source" mean in OSINT?

It means openly available information (any source the public can access legally), not open-source software. That includes websites, social media, public records, news, technical databases, and leaked data already exposed online. The term predates the "open source" label for software and refers to the openness of the information, not the code. The defining trait is that no unauthorized access is needed to obtain it.

Is OSINT legal?

Collecting genuinely public information is generally legal, which is the foundation of OSINT, though the specifics vary by jurisdiction. The line is crossed when collection turns into access: logging into accounts that are not yours, using leaked credentials to enter systems, or social-engineering private data is not OSINT; it is intrusion. Viewing a public profile or a breach record is generally lawful; using what you find to gain unauthorized access is not. For sanctioned engagements, get authorization in writing.

What is the difference between OSINT and threat intelligence?

OSINT is a collection discipline, gathering intelligence from public sources. Threat intelligence is the broader practice of producing actionable knowledge about threats, which uses OSINT as one major source alongside private telemetry, commercial feeds, and internal data. OSINT feeds threat intelligence; it is not the whole of it. Much of cyber threat intelligence, especially infrastructure and actor research, is OSINT-driven.

What are common OSINT tools?

Common categories include search engines with advanced operators for the open web, link-analysis and aggregation tools like Maltego, internet-scanning databases like Shodan, and harvesters like theHarvester for emails and subdomains. The OSINT Framework organizes these resources by goal. The tool matters less than the method: define a question, collect against it, and pivot from each finding. Human analysis to confirm connections is what keeps automated collection accurate.

How do attackers use OSINT?

Attackers use OSINT as reconnaissance (MITRE ATT&CK tactic TA0043) to plan targeted intrusions before touching any system. Public data reveals who to target, how to reach them, what pretext will work for phishing, what technology is in use, and whether credentials are already leaked. Because this collection never touches the target's systems, it generates no alerts, which is why it is invisible and why reducing public exposure is the main defense.

Can you defend against OSINT?

You cannot detect or block the collection itself, since it happens on public sources you do not control. The defense is reducing what it finds: minimize unnecessary public exposure, monitor for leaked credentials and impersonation, run OSINT against yourself to see your own attack surface, and train high-value staff on what they expose. You shrink the attack's homework rather than stopping the research.