Authentication
Many websites require authentication before they serve real content. CSP Analyser supports three authentication patterns, each suited to a different workflow.
Storage State File
The recommended approach for repeatable, automated analysis. A storage state file is a JSON snapshot of cookies, localStorage, and sessionStorage exported by Playwright.
Generating a storage state file
Use the interactive command with --save-storage-state to log in manually and export the session:
# Open a headed browser, log in, browse around, then close the browser
csp-analyser interactive https://app.example.com --save-storage-state auth.json
When you close the browser, the session's cookies, localStorage, and sessionStorage are saved to auth.json with secure file permissions (0600). Permissions are enforced even when overwriting an existing file, and symlink targets are rejected. You can then reuse this file for headless crawls.
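The permission handling described above could be approximated in Node.js roughly as follows. This is a sketch under assumptions, not CSP Analyser's actual code: `saveStorageState` is a hypothetical helper name, and only standard `node:fs` calls are used.

```typescript
import * as fs from "node:fs";

// Hypothetical helper: writes session JSON readable only by the file owner.
function saveStorageState(filePath: string, state: unknown): void {
  // lstat inspects the path itself, so an existing symlink is detected
  // instead of being followed. A missing file yields undefined.
  const existing = fs.lstatSync(filePath, { throwIfNoEntry: false });
  if (existing?.isSymbolicLink()) {
    throw new Error(`Refusing to write through symlink: ${filePath}`);
  }
  // The mode option only takes effect when the file is newly created...
  fs.writeFileSync(filePath, JSON.stringify(state, null, 2), { mode: 0o600 });
  // ...so chmod explicitly to cover the overwrite case as well.
  fs.chmodSync(filePath, 0o600);
}
```

The explicit `chmodSync` matters because overwriting a file that was previously created with looser permissions would otherwise keep those permissions.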
Workflow: interactive login → headless crawl
# Step 1: Log in interactively and save the session
csp-analyser interactive https://app.example.com --save-storage-state auth.json
# Step 2: Use the saved session for a deep headless crawl
csp-analyser crawl https://app.example.com --storage-state auth.json --depth 3 --max-pages 50
This is the recommended workflow for authenticated sites. Log in once interactively, then run repeatable headless crawls with the saved state.
You can also generate a storage state file with Playwright directly:
npx playwright codegen --save-storage=auth.json https://app.example.com
Using a storage state file
csp-analyser crawl https://app.example.com --storage-state auth.json
// start_session or crawl_url tool
{
"targetUrl": "https://app.example.com",
"storageStatePath": "/absolute/path/to/auth.json"
}
The file must have a .json extension and must exist on disk. Symlinks are resolved to prevent path traversal. The real target must also end in .json.
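The path checks described above can be sketched as follows. The function name `resolveStorageStatePath` is hypothetical and the error messages are illustrative; the only grounded detail is that symlinks are resolved with `fs.realpathSync()` and that both the given path and the resolved target must end in .json.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helper: returns the resolved path or throws if a rule is violated.
function resolveStorageStatePath(input: string): string {
  if (path.extname(input) !== ".json") {
    throw new Error("Storage state path must have a .json extension");
  }
  // realpathSync follows symlinks and throws (ENOENT) if nothing exists there.
  const real = fs.realpathSync(input);
  // Re-check the extension on the real target, so a link named
  // "auth.json" cannot point at an arbitrary file.
  if (path.extname(real) !== ".json") {
    throw new Error("Resolved storage state target must end in .json");
  }
  if (!fs.statSync(real).isFile()) {
    throw new Error("Storage state path must be a regular file");
  }
  return real;
}
```

Checking the extension twice is the point of the sketch: the first check validates what the user typed, the second validates where the path actually leads.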
Cookie Injection
Available through the MCP start_session, crawl_url, and audit_policy tools when you already have session cookies (e.g., extracted from browser DevTools or a login API response). Cookies are injected into a fresh browser context before navigation.
// MCP start_session tool - cookies parameter
{
"targetUrl": "https://app.example.com",
"cookies": [
{
"name": "session_id",
"value": "abc123",
"domain": "app.example.com",
"path": "/",
"httpOnly": true,
"secure": true,
"sameSite": "Lax"
}
]
}
Only name and value are required. Optional fields are domain, path, httpOnly, secure, and sameSite (Strict, Lax, or None). If domain is omitted, it defaults to the target URL's hostname. If path is omitted, it defaults to /.
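As an illustration of the defaulting rules, the sketch below fills in domain and path the way the paragraph above describes. The `CookieInput` type and the `applyCookieDefaults` function are hypothetical names introduced for this example, not part of the tool's API.

```typescript
type SameSite = "Strict" | "Lax" | "None";

// Mirrors the fields listed above; only name and value are mandatory.
interface CookieInput {
  name: string;
  value: string;
  domain?: string;
  path?: string;
  httpOnly?: boolean;
  secure?: boolean;
  sameSite?: SameSite;
}

// Hypothetical helper: fills in domain and path when they are omitted.
function applyCookieDefaults(
  cookie: CookieInput,
  targetUrl: string,
): CookieInput & { domain: string; path: string } {
  return {
    ...cookie,
    // Default domain: the hostname of the target URL.
    domain: cookie.domain ?? new URL(targetUrl).hostname,
    // Default path: the site root.
    path: cookie.path ?? "/",
  };
}
```

For example, a cookie given only as name and value for https://app.example.com/dashboard would end up with domain app.example.com and path /.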
Cookie names and values are validated against RFC 6265:
- Names must be valid HTTP tokens (no control characters, spaces, or separators)
- Values must not contain control characters, semicolons, spaces, double quotes, commas, or backslashes
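The two rules can be expressed as regular expressions taken from the RFC 6265 grammar (`token` for names, `cookie-octet` for values). This is a sketch with hypothetical function names; note that `cookie-octet` also excludes non-ASCII characters, and whether CSP Analyser enforces that too is an assumption.

```typescript
// RFC 6265 cookie-name is an HTTP token: visible ASCII without separators.
const TOKEN = /^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$/;

// RFC 6265 cookie-octet: printable ASCII except space, double quote,
// comma, semicolon, and backslash. Control characters are excluded too.
const COOKIE_OCTETS = /^[\x21\x23-\x2B\x2D-\x3A\x3C-\x5B\x5D-\x7E]*$/;

// Hypothetical helpers, named for this example only.
function isValidCookieName(name: string): boolean {
  return TOKEN.test(name);
}

function isValidCookieValue(value: string): boolean {
  return COOKIE_OCTETS.test(value);
}
```

Rejecting semicolons and control characters is what blocks header injection: a value such as `abc; Domain=evil.example` can no longer smuggle extra attributes into the cookie.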
Manual Login (Interactive Mode)
For sites with complex login flows (MFA, CAPTCHA, SSO redirects) that cannot be captured as cookies or storage state.
csp-analyser interactive https://app.example.com
This opens a headed (visible) Chromium browser. You log in manually, browse the pages you want to analyse, and close the browser tab when done. CSP Analyser captures violations in real time while you browse.
WARNING
Manual login requires headed mode. It cannot be used in headless environments (CI, SSH without X11, containers). Use a storage state file for those environments.
Choosing an auth pattern
| Pattern | Best for | Repeatable | CI-friendly |
|---|---|---|---|
| Storage state | Automated crawls, CI pipelines | Yes | Yes |
| Cookie injection | MCP agent workflows | Yes | Yes |
| Manual login | Complex auth flows, initial exploration | No | No |
sessionStorage and token-based auth (MSAL / Azure AD B2C)
Many modern SPAs use token-based authentication (MSAL, Auth0, Firebase Auth) where tokens are stored in sessionStorage rather than cookies. Playwright's built-in storageState() only captures cookies and localStorage — sessionStorage is normally lost.
CSP Analyser extends the storage state format to also capture and restore sessionStorage. When you use --save-storage-state, sessionStorage is captured through multiple mechanisms to ensure no tokens are lost:
- beforeunload handler (via addInitScript): captures the final state right before each page closes, surviving all navigations during the auth flow
- load + 1s delay: catches async MSAL token writes after auth redirects
- 5-second periodic snapshots: catches silent token refreshes between page loads
The captured entries are written into the JSON file alongside cookies and localStorage.
When you later use --storage-state to load the file, CSP Analyser:
- Passes the file to Playwright (restoring cookies + localStorage as normal)
- Reads the sessionStorage extension from the file
- Registers an addInitScript that calls sessionStorage.setItem() for each entry; this runs before any page JavaScript, so tokens are available when frameworks like MSAL initialize
This means MSAL access tokens, ID tokens, and refresh tokens are restored correctly:
# Step 1: Log in via Azure AD B2C and save everything
csp-analyser interactive https://app.example.com --save-storage-state auth.json
# Step 2: Headless crawl with full auth state (including sessionStorage tokens)
csp-analyser crawl https://app.example.com --storage-state auth.json
TIP
If the crawl still redirects to login, the tokens have likely expired. MSAL access tokens typically expire after 1 hour. Re-run the interactive login to generate a fresh storage state file.
Cross-origin sessionStorage
Only sessionStorage entries for the target origin are restored. Cross-origin sessionStorage (e.g. from the identity provider domain) cannot be injected and is not useful for CSP crawling.
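The restore step and the same-origin restriction can be sketched together as a function that generates the init script source. `buildRestoreScript` and the flat key/value shape of the entries are assumptions made for illustration; the actual layout of the sessionStorage extension in the file is not specified here.

```typescript
// Assumed shape: key/value pairs captured for the target origin.
type SessionEntries = Record<string, string>;

// Hypothetical helper: returns JavaScript source to run before page scripts.
function buildRestoreScript(targetOrigin: string, entries: SessionEntries): string {
  // JSON.stringify yields valid JavaScript literals for strings and
  // plain string-valued objects, so the data is embedded safely.
  const origin = JSON.stringify(targetOrigin);
  const data = JSON.stringify(entries);
  return `(() => {
    // Skip pages on other origins, such as the identity provider.
    if (window.location.origin !== ${origin}) return;
    const entries = ${data};
    for (const key of Object.keys(entries)) {
      window.sessionStorage.setItem(key, entries[key]);
    }
  })();`;
}
```

With Playwright, the generated string could be registered through `context.addInitScript({ content: script })`, which runs it in every new document before the page's own scripts.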
How CSP injection works with auth redirects
When you analyse an authenticated site, the browser navigates through external identity providers (Azure AD B2C, Okta, Auth0, etc.) before returning to your app. CSP Analyser needs to inject its deny-all CSP header on your app's pages — but not on the IdP's pages, which would pollute the generated policy with violations from the IdP's own resources.
The redirect challenge
CSP Analyser intercepts all HTTP requests via Playwright's route handler and injects CSP headers only on responses from the target origin. Requests to other origins (the IdP login page, intermediate auth endpoints) pass through unmodified.
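The origin check at the heart of this rule can be sketched as two small pure functions. The policy string `default-src 'none'` and the function names are assumptions for this example; this document does not state the exact header value CSP Analyser sends, nor whether an existing policy is replaced.

```typescript
// Assumed deny-all policy, for illustration only.
const DENY_ALL_CSP = "default-src 'none'";

// True when the request goes to the same scheme, host, and port as the target.
function isTargetOrigin(requestUrl: string, targetUrl: string): boolean {
  return new URL(requestUrl).origin === new URL(targetUrl).origin;
}

// Returns the response headers to serve. Header names are lower case,
// which is how Playwright reports them.
function withInjectedCsp(
  headers: Record<string, string>,
  requestUrl: string,
  targetUrl: string,
): Record<string, string> {
  if (!isTargetOrigin(requestUrl, targetUrl)) {
    return headers; // IdP and other third-party responses pass through unmodified
  }
  return { ...headers, "content-security-policy": DENY_ALL_CSP };
}
```

Comparing full origins rather than hostnames means that a different port or scheme on the same host is treated as a separate origin and left untouched.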
The challenge is that Playwright's route handler does not re-intercept requests that result from HTTP 302 redirects. When the IdP responds with 302 Location: https://your-app.com/auth/callback, the browser follows the redirect internally and the route handler never sees the callback request — so CSP headers are never injected on the post-auth page.
The workaround
CSP Analyser works around this by intercepting non-target-origin navigation requests and replacing HTTP redirect responses (301/302/303) with a small HTML page that performs the same redirect via JavaScript (window.location.href). The browser processes any Set-Cookie headers from the original response (preserving auth state), then navigates to the redirect target as a fresh request that the route handler can intercept and inject CSP on.
This happens transparently — you won't notice the intermediate page during the auth flow.
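A replacement page of the kind described above might be generated like this. It is a minimal sketch with a hypothetical function name, not the tool's actual markup; the escaping step is an extra precaution added for the example.

```typescript
// Hypothetical helper: builds an HTML page that navigates to `location`
// from JavaScript instead of via an HTTP redirect.
function buildJsRedirectPage(location: string): string {
  // JSON.stringify produces a JavaScript string literal. Escaping "<"
  // prevents a URL containing "</script>" from ending the script element early.
  const literal = JSON.stringify(location).replace(/</g, "\\u003c");
  return (
    '<!DOCTYPE html><html><head><meta charset="utf-8"></head><body>' +
    `<script>window.location.href = ${literal};</script>` +
    "</body></html>"
  );
}
```

In a route handler, a page like this would be served with status 200 in place of the redirect, with the original Set-Cookie headers kept so that the browser stores them before navigating.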
Multi-hop redirect chains
Some identity providers redirect through multiple intermediate endpoints before returning to your app (e.g., auth/authorize → ProcessAuth → kmsi → app/callback). Each hop is handled individually: the route handler rewrites each redirect as a JS navigation, so the browser's full cookie jar is available at every step. Cookie-dependent intermediate endpoints work correctly.
Known limitation: 307/308 redirects
HTTP 307 and 308 redirects preserve the request method and body (unlike 302 which converts to GET). When such a redirect targets your app's origin, CSP Analyser rewrites it as a JS redirect to ensure CSP injection, which converts the request to GET. This can change the behaviour of callback endpoints that depend on POST semantics.
In practice this rarely matters:
- OAuth 2.0 and OIDC specify 302 for authorization responses — 307/308 on the auth return leg is essentially non-existent
- SPA callback endpoints typically accept GET (the authorization code is in the URL query string)
- The alternative (no CSP injection) would silently miss all violations on the post-auth page
For intermediate hops between external origins, 307/308 are passed through unchanged to preserve method and body semantics.
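The handling of the different status codes can be summarised as a decision function. This is a sketch of the rules as stated in this section, with hypothetical names; it assumes the caller has already established that the response belongs to a navigation request on a non-target origin.

```typescript
type RedirectAction = "rewrite-as-js" | "pass-through" | "not-a-redirect";

// Hypothetical helper: decides what to do with a response status and Location header.
function redirectAction(
  status: number,
  location: string,
  requestUrl: string,
  targetUrl: string,
): RedirectAction {
  if (status === 301 || status === 302 || status === 303) {
    // Every such hop becomes a JS navigation so the next request is interceptable.
    return "rewrite-as-js";
  }
  if (status === 307 || status === 308) {
    // Location may be relative, so resolve it against the request URL.
    const destination = new URL(location, requestUrl);
    const landsOnTarget = destination.origin === new URL(targetUrl).origin;
    // Rewriting converts the request to GET; only accept that cost
    // when it is needed to inject CSP on the app's own page.
    return landsOnTarget ? "rewrite-as-js" : "pass-through";
  }
  return "not-a-redirect";
}
```

The asymmetry is deliberate: losing POST semantics is tolerated only on the final hop into the app, where skipping CSP injection would hide every violation on the post-auth page.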
Security notes
DANGER
Storage state files and cookies contain session secrets. Never commit them to version control.
- Add auth.json and *.storage-state.json to your .gitignore
- Storage state files can contain localStorage and sessionStorage data, which may include auth tokens, user data, or API keys
- The --storage-state path is resolved through symlinks using fs.realpathSync() to prevent symlink-based path traversal attacks
- Cookie values are validated against RFC 6265 to prevent header injection
FAQ
How long does a storage state file stay valid?
It depends on the site's session expiry. Most session cookies expire after hours or days. If a crawl fails with authentication errors, regenerate the storage state by logging in again with csp-analyser interactive --save-storage-state auth.json.
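One way to check a file before a crawl is to inspect the cookie expiry times it records. The sketch below relies on Playwright's storage state format, where each cookie's expires field is a Unix time in seconds and -1 marks a session cookie. The function name is hypothetical, and the check says nothing about tokens held in localStorage or sessionStorage, which may expire independently.

```typescript
// Only the fields needed for the expiry check are modelled here.
interface StoredCookie {
  name: string;
  expires: number; // Unix time in seconds; -1 for session cookies
}

interface StorageState {
  cookies: StoredCookie[];
}

// Hypothetical helper: returns the names of cookies whose expiry has passed.
function expiredCookieNames(state: StorageState, nowMs: number = Date.now()): string[] {
  const nowSeconds = nowMs / 1000;
  return state.cookies
    // Negative values mean "no fixed expiry", so they are never reported.
    .filter((cookie) => cookie.expires >= 0 && cookie.expires <= nowSeconds)
    .map((cookie) => cookie.name);
}
```

An empty result does not guarantee a valid session, since the server can invalidate it at any time, but a non-empty result is a strong hint that the file should be regenerated.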
Can I use OAuth / SSO / MFA?
Yes — use the interactive command to log in through any browser-based flow (OAuth redirects, SAML SSO, TOTP/MFA). Once logged in, save the session with --save-storage-state for reuse in headless crawls.
Does cookie injection work with the CLI?
Cookie injection is currently only available through the MCP tools. The CLI supports storage state files (--storage-state) and interactive login. For CLI workflows, generate a storage state file first, then pass it to crawl or audit.
Can I analyse multiple authenticated sites in one session?
No. Each session targets a single URL origin. Run separate crawls for each site, each with its own storage state file or cookie set.