Cross-Site Scripting (XSS): How the Browser Security Model Works and Why It Breaks

Cross-Site Scripting (XSS) is a web application vulnerability that allows attackers to inject malicious scripts into web pages viewed by other users. By exploiting the browser's inherent trust in content served from a legitimate origin, XSS enables session hijacking, credential theft, and unauthorized actions, all executed silently within the victim's browser without requiring any server compromise.
Introduction: Why XSS Still Tops the Charts
If you've spent any time in a SOC or reviewed web application penetration test reports, you already know: Cross-Site Scripting vulnerabilities are everywhere. They consistently appear in the OWASP Top 10, they routinely surface in bug bounty programs, and they remain one of the most misunderstood attack classes among developers and security teams alike.
The reason XSS is so persistent and so dangerous isn't that it's technically exotic. It's that it exploits something fundamental: the way browsers are designed to trust content. To truly understand XSS, you need to understand the browser security model that XSS breaks. This article will walk you through the Same-Origin Policy, the DOM trust model, cookie and session token mechanics, and show you precisely, step by step, how an XSS attack turns all of it against the user.
⤠Browser-based threats don't stop at XSS. Walk through a real-world triage scenario in our SOC Simulator: Malware Download Alert Investigation from Browser Telemetry.
The Browser Security Model: Built on Trust
Modern web browsers are, in essence, operating systems for web content. They execute code (JavaScript), manage state (cookies, storage), render interfaces (HTML/CSS), and communicate with remote servers, all simultaneously, often across dozens of open tabs from dozens of different websites.
The central challenge the browser faces is isolation: how do you let bank.com and socialmedia.com run side by side in the same browser without letting one read the other's data?
The answer is the Same-Origin Policy.
Same-Origin Policy (SOP): The Foundation of Browser Security
The Same-Origin Policy is the browser's primary security boundary. It dictates that a script running on one origin cannot access the resources of a different origin unless explicitly permitted.
An origin is defined by three components working together:
- Protocol (scheme): for example, https
- Host: for example, bank.com
- Port: for example, 443 (the default for https)
Two URLs are "same-origin" only if all three components match exactly. Consider these examples:
- https://bank.com/account and https://bank.com/profile → Same origin
- https://bank.com and http://bank.com → Different origin (protocol differs)
- https://bank.com and https://api.bank.com → Different origin (subdomain differs)
- https://bank.com and https://bank.com:8080 → Different origin (port differs)
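The comparison above can be sketched in a few lines of JavaScript using the standard URL class. This is a simplified model for illustration, not the browser's actual implementation (real browsers also handle special cases such as opaque origins); the function names `originParts` and `isSameOrigin` are hypothetical.

```javascript
// Split a URL into the three components that define its origin.
function originParts(url) {
  const u = new URL(url);
  // The URL class normalizes an explicit default port (443 for https, 80 for http) to "".
  return { protocol: u.protocol, host: u.hostname, port: u.port };
}

// Two URLs are same-origin only if protocol, host, and port all match exactly.
function isSameOrigin(a, b) {
  const x = originParts(a);
  const y = originParts(b);
  return x.protocol === y.protocol && x.host === y.host && x.port === y.port;
}

console.log(isSameOrigin("https://bank.com/account", "https://bank.com/profile")); // true
console.log(isSameOrigin("https://bank.com", "http://bank.com"));                  // false
console.log(isSameOrigin("https://bank.com", "https://api.bank.com"));             // false
console.log(isSameOrigin("https://bank.com", "https://bank.com:8080"));            // false
```

Note that the path plays no part in the comparison, which is why /account and /profile share an origin.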
What SOP Actually Protects
Under SOP, a script from attacker.com cannot:
- Read the DOM contents of a page on victim.com
- Access cookies scoped to victim.com
- Make authenticated requests to victim.com's API and read the responses
This sounds robust. And for cross-origin attacks, it largely is. But here's the critical insight that underpins every XSS attack:
SOP only restricts cross-origin scripts. It places no restrictions on scripts that run from within the same origin.
If an attacker can get their JavaScript to execute in the context of bank.com (not from attacker.com, but as if it were served by bank.com), then SOP is completely irrelevant. The browser has no way to distinguish the attacker's injected script from the site's own legitimate JavaScript.
That is the core exploitation primitive of every Cross-Site Scripting attack.
The DOM Trust Model: How the Browser Decides What to Execute
The Document Object Model (DOM) is the browser's live, in-memory representation of a web page. Every element on the page (paragraphs, buttons, forms, input fields) is a node in this tree, and JavaScript can read and modify any of it in real time.
The browser's DOM trust model is straightforward and unforgiving: it trusts and executes all JavaScript that is present within the context of a page, regardless of its origin or intent. There is no runtime semantic analysis, no intent detection, and no behavioral sandboxing between the site's own code and attacker-injected code. From the browser's perspective, JavaScript is JavaScript.
Why This Matters for XSS
When a web application takes user-supplied input (a search query, a username, a comment) and renders it back into the HTML of a page without proper sanitization or encoding, it inadvertently places attacker-controlled text into the DOM. If that text contains valid JavaScript wrapped in a <script> tag or an inline event handler, the browser will parse it and execute it.
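A classic example: the application renders a "Welcome" message by inserting the username into the page's HTML without sanitizing or encoding it. If the attacker registers a username that consists of a <script> element, the rendered HTML contains that element verbatim, directly inside the welcome message.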
The browser sees valid HTML. It encounters a <script> tag. It executes the code, which is now running in the origin context of the legitimate application. Every privilege that the application's JavaScript has, the attacker's script now also has.
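To make the welcome-message scenario concrete, here is a minimal sketch of the vulnerable pattern. The helper name `renderWelcomeUnsafe`, the markup, and the harmless alert(1) probe are all assumptions chosen for illustration; they are not taken from any particular application.

```javascript
// Hypothetical template helper: concatenates the username into HTML with no encoding.
function renderWelcomeUnsafe(username) {
  return "<h1>Welcome, " + username + "!</h1>";
}

// A normal username produces the markup the developer intended.
console.log(renderWelcomeUnsafe("alice"));
// <h1>Welcome, alice!</h1>

// A username that is itself markup becomes part of the page structure.
console.log(renderWelcomeUnsafe("<script>alert(1)</script>"));
// <h1>Welcome, <script>alert(1)</script>!</h1>
```

Nothing in the second output marks the script element as foreign. By the time the browser parses the response, the boundary between the application's markup and the attacker's input is gone.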
Cookies and Session Tokens: The Primary Target
To understand why XSS is catastrophically dangerous, you need to understand what session cookies are and how authentication works on the web.
How Sessions Work
HTTP is a stateless protocol. Each request is independent, and the server has no inherent memory of prior interactions. To maintain authenticated sessions (that is, to know that the person requesting /account/balance is the same person who logged in 10 minutes ago), web applications issue session tokens.
After successful authentication, the server generates a cryptographically random, unique session identifier and sends it to the browser in a Set-Cookie response header.
From that point forward, the browser automatically attaches this cookie, in the Cookie request header, to every subsequent request to that domain.
The server validates the session token against its database and processes the request as that authenticated user. The session token IS the identity. Whoever possesses it can impersonate the user without knowing their password.
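The round trip can be sketched as follows. This is a simplified model under stated assumptions: the cookie name `session_id`, the in-memory `sessions` map, the helper names, and the fixed placeholder token are all hypothetical, and a real application would generate the token with a cryptographically secure random source.

```javascript
// Hypothetical in-memory session store: token -> username.
const sessions = new Map();

// Build a Set-Cookie header value. HttpOnly, Secure, and SameSite are standard attributes.
function buildSetCookie(name, value) {
  return `${name}=${value}; Path=/; HttpOnly; Secure; SameSite=Lax`;
}

// Parse a Cookie request header ("a=1; b=2") into a name -> value object.
function parseCookieHeader(header) {
  const cookies = {};
  for (const pair of header.split(";")) {
    const index = pair.indexOf("=");
    if (index === -1) continue; // skip malformed fragments
    cookies[pair.slice(0, index).trim()] = pair.slice(index + 1).trim();
  }
  return cookies;
}

// The server identifies the user purely by the token in the Cookie header.
function authenticate(cookieHeader) {
  const token = parseCookieHeader(cookieHeader).session_id;
  return sessions.get(token) ?? null;
}

// Login: store the token server-side and hand it to the browser.
const placeholderToken = "PLACEHOLDER_TOKEN"; // placeholder, not a real random token
sessions.set(placeholderToken, "alice");
console.log(buildSetCookie("session_id", placeholderToken));

// Later requests: any client presenting the token is treated as alice.
console.log(authenticate(`session_id=${placeholderToken}; theme=dark`)); // alice
console.log(authenticate("session_id=some-other-value"));                // null
```

Notice that `authenticate` never asks who is sending the token. That is the property an attacker relies on.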
⤠See our deep dive on the Credential Theft Guide to understand how attackers harvest and abuse stolen identities across the kill chain.
The HttpOnly Flag: Important, But Not a Silver Bullet
Security-conscious developers set the HttpOnly flag on session cookies. This flag instructs the browser to prevent JavaScript from accessing the cookie via document.cookie; the cookie is still sent automatically on HTTP requests.
This is a meaningful control. It blocks the most basic XSS session-theft payload, which reads document.cookie and sends the value to a server the attacker controls.
However, HttpOnly does not prevent XSS from hijacking the session. An attacker's injected script running in the victim's browser context can simply use the cookies directly by making same-origin requests, for example with fetch(), to which the browser automatically attaches them.
The attacker doesn't need to steal the token. They just need to use the token, and the browser does that for them automatically.
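The difference between reading a cookie and using one can be shown with a toy model of a browser cookie jar. The `CookieJar` class below is hypothetical and deliberately simplified (it ignores Secure, SameSite, paths, and expiry); it exists only to illustrate which view of the cookies each party gets.

```javascript
// Toy model of the cookies a browser holds for one origin (not a real browser API).
class CookieJar {
  constructor() {
    this.cookies = [];
  }

  set(name, value, { httpOnly = false } = {}) {
    this.cookies.push({ name, value, httpOnly });
  }

  // What page JavaScript sees via document.cookie: HttpOnly cookies are hidden.
  scriptView() {
    return this.cookies
      .filter((cookie) => !cookie.httpOnly)
      .map((cookie) => `${cookie.name}=${cookie.value}`)
      .join("; ");
  }

  // What the browser attaches to a request to this origin: every cookie, HttpOnly or not.
  requestHeader() {
    return this.cookies.map((cookie) => `${cookie.name}=${cookie.value}`).join("; ");
  }
}

const jar = new CookieJar();
jar.set("session_id", "PLACEHOLDER_TOKEN", { httpOnly: true });
jar.set("theme", "dark");

console.log(jar.scriptView());    // theme=dark
console.log(jar.requestHeader()); // session_id=PLACEHOLDER_TOKEN; theme=dark
```

An injected script is limited to `scriptView()` when it reads, but every request it triggers goes out with `requestHeader()`.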
How XSS Steals Sessions: A Step-by-Step Breakdown
Let's walk through a complete attack chain, from vulnerability to session compromise. This is the narrative every SOC analyst needs to be able to reconstruct from logs and explain to application teams.
Step 1: Discovery of an Injection Point
The attacker identifies a parameter that is reflected unsanitized into the page response. This might be a search field, URL parameter, form input, or user profile field (for stored XSS). The input is rendered into the DOM without encoding.
Step 2: Payload Crafting
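The attacker constructs a JavaScript payload. In a reflected XSS scenario, they embed it in a URL parameter, as in victim.com/search?q=[payload], where the payload is typically a <script> element that sends document.cookie to an attacker-controlled server.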
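For HttpOnly cookies, the payload pivots to action-based exploitation: rather than reading the cookie, the script issues authenticated requests to the application from inside the victim's browser.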
Step 3: Delivery
The attacker delivers the malicious URL to the victim via phishing email, social media message, or a compromised third-party site. The victim, already logged into victim.com, clicks the link.
Step 4: Execution in the Victim's Browser Context
The victim's browser loads victim.com/search?q=[payload]. The server reflects the unsanitized query into the HTML. The browser parses the response, encounters the injected script, and executes it in the context of victim.com.
At this moment, the attacker's code has every privilege of a legitimate script on victim.com. SOP does not intervene when the script is executing from within the origin.
Step 5: Session Compromise
For non-HttpOnly cookies: document.cookie is exfiltrated, and the attacker imports the token into their own browser using developer tools. They are now authenticated as the victim, with no credentials required.
For HttpOnly cookies: the script performs authenticated actions directly (fund transfers, password changes, data exfiltration, adding an attacker-controlled OAuth token), all appearing in server logs as legitimate user activity originating from the victim's IP address.
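When reconstructing this chain from web server logs, a useful first pass is to decode each request's query parameters and look for markup in them. The sketch below is a rough heuristic under stated assumptions: the pattern list and the function name `findSuspiciousParams` are hypothetical, the patterns will produce both false positives and false negatives, and the victim.com base URL is used only to resolve relative log paths.

```javascript
// Illustrative patterns only; not a complete or evasion-resistant detection rule set.
const SUSPICIOUS_PATTERNS = [
  /<script\b/i,     // script element
  /\bon\w+\s*=/i,   // inline event handler such as onerror=
  /javascript:/i,   // javascript: URL scheme
];

// Return the names of query parameters whose decoded value matches a pattern.
function findSuspiciousParams(requestPath) {
  const url = new URL(requestPath, "https://victim.com");
  const hits = [];
  for (const [name, value] of url.searchParams) {
    // searchParams yields percent-decoded values, so encoded payloads are matched too.
    if (SUSPICIOUS_PATTERNS.some((pattern) => pattern.test(value))) {
      hits.push(name);
    }
  }
  return hits;
}

// Flags the q parameter: it decodes to <script>alert(1)</script>.
console.log(findSuspiciousParams("/search?q=%3Cscript%3Ealert(1)%3C%2Fscript%3E"));

// Flags nothing: an ordinary search query.
console.log(findSuspiciousParams("/search?q=browser+security"));
```

A check like this only covers reflected payloads that travel in the query string. Stored XSS and payloads delivered in the URL fragment never appear in this part of the log.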
⤠See Advanced Forensics Techniques for SOC Analysts for the full artifact analysis methodology.
XSS Variants: Reflected, Stored, and DOM-Based
Understanding where Cross-Site Scripting injection points live shapes both detection and remediation strategy.
Reflected XSS occurs when the malicious payload is embedded in a request (typically a URL parameter), and the server immediately reflects it back in the response. It requires the victim to click a crafted link. Severity is high, but each delivery compromises only the session of the victim who clicks.
Stored XSS, also called persistent XSS, is considerably more dangerous. The payload is stored in the application's database (in a comment field, profile bio, or forum post) and is served to every user who views that content. A single injection can compromise hundreds or thousands of sessions without any further attacker action.
DOM-Based XSS bypasses the server entirely. The vulnerability lives in client-side JavaScript that reads from an attacker-influenced source (like location.hash) and writes to a dangerous sink (like innerHTML) without sanitization. Standard server-side output encoding does not protect against DOM XSS; it requires dedicated client-side security practices.
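The source-to-sink pattern looks roughly like this. The function names are hypothetical, and the fragment is passed in as a parameter (in a browser it would come from location.hash) so the sketch also runs outside a browser, where `element` can be any plain object.

```javascript
// Source: the URL fragment, which an attacker controls by crafting the link.
// Note: decodeURIComponent throws on malformed input; real code should handle that.
function nameFromHash(hash) {
  return decodeURIComponent(hash.replace(/^#/, ""));
}

// Vulnerable sink: innerHTML parses the string as markup. Injected <script> elements
// do not run when inserted this way, but event handlers such as onerror do.
function greetUnsafe(element, hash) {
  element.innerHTML = "Hello, " + nameFromHash(hash);
}

// Safer sink: textContent treats the string as plain text, never as markup.
function greetSafe(element, hash) {
  element.textContent = "Hello, " + nameFromHash(hash);
}

// Stand-in for a DOM element so the example runs under Node as well.
const stubElement = {};
greetSafe(stubElement, "#%3Cimg%20src%3Dx%20onerror%3Dalert(1)%3E");
console.log(stubElement.textContent); // Hello, <img src=x onerror=alert(1)>
```

In a real page, the string written by `greetSafe` is displayed literally, while the same string written by `greetUnsafe` would create an img element whose onerror handler fires. The server never sees the fragment in either case.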
Defensive Controls: What Actually Works
Defending against XSS is a layered effort. No single control is sufficient.
Output encoding is the primary defense. User-supplied data rendered into HTML must be HTML-entity-encoded. Data rendered into JavaScript contexts must be JavaScript-escaped. Context matters; the encoding rules for HTML attributes, HTML body content, JavaScript strings, and URL parameters are all different.
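A minimal sketch of context-specific encoding follows, covering two of the contexts mentioned above. The helper names `escapeHtml` and `toScriptStringLiteral` are hypothetical, and in production a maintained encoding library or the framework's built-in escaping is usually the better choice.

```javascript
const HTML_ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#x27;",
};

// HTML body and quoted-attribute context: replace characters that can start markup
// or terminate an attribute value.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// JavaScript string context inside an inline <script> block: JSON.stringify handles
// quotes and backslashes, and escaping <, >, and & keeps the value from closing the
// script element or being reinterpreted by the HTML parser.
function toScriptStringLiteral(value) {
  return JSON.stringify(String(value))
    .replace(/</g, "\\u003c")
    .replace(/>/g, "\\u003e")
    .replace(/&/g, "\\u0026");
}

console.log(escapeHtml("<script>alert(1)</script>"));
// &lt;script&gt;alert(1)&lt;/script&gt;

console.log(toScriptStringLiteral("</script>"));
// "\u003c/script\u003e"
```

The two functions are not interchangeable: HTML entities are not decoded inside a script block, and JavaScript escapes mean nothing in HTML body text. Choosing the wrong encoder for the context leaves the injection open.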
Content Security Policy (CSP) instructs the browser to only execute scripts from approved sources. A well-configured CSP can prevent inline script execution entirely, meaning injected <script> tags and inline event handlers fail to execute. CSP is a powerful compensating control, but misconfiguration is common, and bypass techniques exist.
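As an illustration, a nonce-based policy might be assembled like this. The directive values are one reasonable starting point rather than a recommended universal policy, the function name `buildCsp` is hypothetical, and the nonce is passed in as a placeholder; a real server must generate a fresh, unpredictable nonce for every response.

```javascript
// Build a Content-Security-Policy header value that allows same-origin scripts and
// inline scripts carrying the matching nonce. Because 'unsafe-inline' is absent,
// injected inline scripts and inline event handlers are refused.
function buildCsp(nonce) {
  return [
    "default-src 'self'",
    `script-src 'self' 'nonce-${nonce}'`,
    "object-src 'none'",
    "base-uri 'none'",
  ].join("; ");
}

console.log(buildCsp("PLACEHOLDER_NONCE"));
// default-src 'self'; script-src 'self' 'nonce-PLACEHOLDER_NONCE'; object-src 'none'; base-uri 'none'
```

An injected script element cannot know the per-response nonce, so the browser declines to run it even though it sits in the page's HTML.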
The HttpOnly and Secure cookie flags limit cookie exposure. HttpOnly prevents direct JavaScript access; Secure ensures cookies are only transmitted over HTTPS. Neither prevents action-based XSS exploitation, but they raise the bar.
Input validation should not be treated as a primary XSS defense, because validation rules are frequently bypassable and context-dependent. It remains a useful defense-in-depth layer.
⤠Catching XSS before it reaches production requires runtime testing. Learn how in the DAST Practical Guide for Security Teams.
Modern frameworks such as React, Angular, and Vue encode output automatically by default. Writing to sinks such as innerHTML or document.write, or evaluating strings with eval, bypasses these protections and should be flagged in code review.
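A simple text scan can support that code-review step. The sketch below is a hypothetical helper (`findSinks`) using rough regular expressions; it covers only the three sinks named above, will miss obfuscated or indirect uses, and is no substitute for a proper static analysis tool.

```javascript
// Rough patterns for the sinks named above; assumptions for illustration only.
const SINK_PATTERNS = [
  { name: "innerHTML", pattern: /\.innerHTML\s*\+?=(?!=)/ }, // assignment, not comparison
  { name: "document.write", pattern: /\bdocument\.write(ln)?\s*\(/ },
  { name: "eval", pattern: /\beval\s*\(/ },
];

// Return one finding per line and sink, with 1-based line numbers.
function findSinks(source) {
  const findings = [];
  source.split("\n").forEach((line, index) => {
    for (const { name, pattern } of SINK_PATTERNS) {
      if (pattern.test(line)) {
        findings.push({ line: index + 1, sink: name });
      }
    }
  });
  return findings;
}

const sample = [
  "el.textContent = name;",
  'el.innerHTML = "<b>" + name + "</b>";',
  "eval(userCode);",
].join("\n");

// Reports innerHTML on line 2 and eval on line 3; the textContent line is not flagged.
console.log(findSinks(sample));
```

Each finding is a prompt for a human reviewer to trace where the written value comes from, not proof of a vulnerability.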
Key Takeaways for Security Teams
Cross-Site Scripting attacks are not brute-force exploits. They are logical exploits of trust: specifically, the browser's trust in content delivered from a recognized origin. The Same-Origin Policy, the DOM trust model, and automatic cookie attachment are features that work exactly as designed. XSS weaponizes them.
As a SOC analyst or security educator, the most important mental model to internalize and to communicate is this: an XSS payload does not need to steal a token to steal a session. It merely needs to execute within the right context. The browser's own security mechanisms become the delivery vehicle.
Understanding this distinction transforms how your teams prioritize alerts, interpret logs, and advise developers on the remediation controls that will actually reduce organizational risk.