What Is a Shadow API? The Complete Developer's Guide
A comprehensive guide to shadow APIs: the internal, undocumented endpoints that power every modern website. Learn what they are, how to find them, and why they matter for AI agents and browser automation.
Open your browser's DevTools on any modern website. Click the Network tab. Reload the page. Watch the requests scroll by.
Between the HTML document, CSS files, JavaScript bundles, and images, you will see something else: dozens of XHR and Fetch requests returning JSON data. Product listings. User profiles. Search results. Comments. Recommendations. Analytics events.
These are shadow APIs — the internal endpoints that power the website's frontend. They are not documented. They are not versioned. They are not intended for external consumption. But they exist, they return structured data, and they are far more useful than the HTML they render.
Definition
A shadow API is any HTTP endpoint that a website uses internally to serve data to its own frontend, but does not expose as a public, documented API.
The term distinguishes these endpoints from:
- Public APIs — Documented, versioned, rate-limited endpoints intended for third-party developers (e.g., GitHub API, Stripe API, Twitter API v2)
- Private APIs — Internal microservice endpoints that are not exposed to the internet at all (e.g., an internal auth service behind a VPN)
- Backend-for-Frontend (BFF) — An architecture pattern in which a dedicated backend serves a specific frontend; shadow APIs are the endpoints of the BFF that a browser can observe
Shadow APIs sit in a gray zone. They are exposed to the internet (your browser can reach them), they return structured data (usually JSON), and they are functionally identical to a public API — except nobody documented them.
Why Shadow APIs Exist
Nearly every modern website is a single-page application (SPA) or uses SPA techniques for dynamic content. The architecture is almost always the same:
- Browser loads an HTML shell + JavaScript bundle
- JavaScript runs and makes API calls to fetch data
- Data arrives as JSON
- JavaScript renders the data into the DOM
The API calls in step 2 are shadow APIs. They exist because the website needs them to function. The developer built them for the frontend, not for you — but the browser is your tool, and those requests are visible.
The Scale of Shadow APIs
Consider what happens when you load a single Reddit page:
- oauth.reddit.com/api/v1/me — Your user profile
- oauth.reddit.com/r/{subreddit}/hot.json?limit=25 — Post listing
- oauth.reddit.com/api/morechildren — Comment tree expansion
- oauth.reddit.com/api/vote — Vote state for each post
- gql.reddit.com/ — GraphQL queries for recommendations
- oauth.reddit.com/api/trending_searches — Search suggestions
Six+ API endpoints from a single page load. Each returns structured, typed JSON. Each is a shadow API.
Now multiply this across the internet. Every SaaS dashboard, every e-commerce site, every social platform, every news site — they all have shadow APIs. Conservative estimates suggest there are hundreds of millions of shadow API endpoints across the public web.
Shadow APIs vs. Public APIs
Understanding the differences helps you work with shadow APIs effectively.
Structure
Public APIs are designed for developers. They have consistent URL patterns, versioned paths (/v1/, /v2/), standardized error responses, and pagination schemes.
Shadow APIs are designed for a specific frontend. They might use inconsistent naming, embed business logic in URL parameters, return different response shapes for the same endpoint based on query flags, or mix concerns that a public API would separate.
Authentication
Public APIs use explicit authentication: API keys, OAuth tokens, or webhook signatures. You get credentials through a developer portal.
Shadow APIs use whatever auth the frontend uses — usually session cookies set during login. If you are logged in to the website, your browser automatically authenticates every shadow API call. This is both simpler (no developer portal) and more complex (cookies are fragile, expire, and are tied to browser state).
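In practice, reusing the frontend's auth means attaching your browser session's cookies to your own requests. A minimal Python sketch — the cookie names, values, and endpoint below are hypothetical, copied-from-DevTools stand-ins:

```python
import urllib.request

# Hypothetical cookies copied from a logged-in browser session (DevTools >
# Application > Cookies). Names and values are made up for illustration.
cookies = {"session_id": "abc123", "csrftoken": "xyz789"}
cookie_header = "; ".join(f"{k}={v}" for k, v in cookies.items())

req = urllib.request.Request(
    "https://example.com/api/v1/me",  # hypothetical shadow endpoint
    headers={
        "Cookie": cookie_header,
        "Accept": "application/json",
        # Many shadow APIs also check browser-like headers such as this one:
        "X-Requested-With": "XMLHttpRequest",
    },
)
# urllib.request.urlopen(req) would now authenticate exactly as the browser does.
```

The fragility mentioned above is visible here: when the session expires, every request built this way starts failing until the cookies are refreshed from a new browser session.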
Rate Limiting
Public APIs publish rate limits and return structured 429 responses with retry-after headers.
Shadow APIs usually have rate limits too, but they are tuned for normal browsing behavior, not API consumption. Making 1,000 requests per minute to a shadow API will almost certainly trigger anti-bot measures, even at a rate a documented public API might happily serve.
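A simple client-side throttle keeps request pacing close to human browsing. A sketch — the 2-second interval is an illustrative choice, not a documented limit:

```python
import time

class Throttle:
    """Client-side pacing so shadow-API calls stay near a browsing-like rate."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval  # seconds between consecutive requests
        self._last = 0.0

    def wait(self) -> float:
        """Block until min_interval has passed since the previous call;
        returns the number of seconds actually slept."""
        now = time.monotonic()
        delay = max(0.0, self._last + self.min_interval - now)
        if delay:
            time.sleep(delay)
        self._last = time.monotonic()
        return delay

# ~30 requests/minute: an illustrative browsing-like pace
throttle = Throttle(min_interval=2.0)
# Call throttle.wait() before each request to the shadow API.
```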
Stability
Public APIs are versioned. Breaking changes get new version numbers. Deprecation is announced months in advance.
Shadow APIs change whenever the frontend changes. A new UI release can rename endpoints, restructure response schemas, or eliminate endpoints entirely — with no notice. Shadow APIs are inherently less stable than public APIs.
Documentation
Public APIs have documentation. Shadow APIs have DevTools.
How to Find Shadow APIs
There are three approaches, from manual to fully automated.
Manual Discovery (DevTools)
The simplest approach:
- Open the target website in Chrome
- Open DevTools (F12) > Network tab
- Filter by XHR/Fetch
- Browse the site normally
- Click on each request to see URL, headers, request body, and response
This works for exploration but does not scale. You cannot manually catalog hundreds of endpoints across dozens of sites.
HAR Capture
A step up from manual inspection:
- Open DevTools > Network tab
- Browse the site
- Right-click the request list > Save all as HAR with content
- Parse the HAR file programmatically
HAR files contain every request/response pair in a structured JSON format. You can write scripts to extract endpoints, headers, and schemas from them.
Limitation: HAR capture through DevTools misses some async requests, especially those fired by Web Workers or during complex SPA state transitions.
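The extraction step can be sketched in a few lines of Python. This shape works on any HAR 1.2 file; the fixture below is a minimal stand-in for a real "Save all as HAR with content" export:

```python
def extract_endpoints(har: dict) -> list[dict]:
    """Pull JSON API calls out of a HAR capture, skipping static assets."""
    endpoints = []
    for entry in har["log"]["entries"]:
        mime = entry["response"]["content"].get("mimeType", "")
        if "json" not in mime:
            continue  # skip HTML, CSS, JS bundles, images
        endpoints.append({
            "method": entry["request"]["method"],
            "url": entry["request"]["url"].split("?")[0],  # drop query string
            "status": entry["response"]["status"],
        })
    return endpoints

# Minimal HAR-shaped fixture; real files come from DevTools
har = {"log": {"entries": [
    {"request": {"method": "GET", "url": "https://example.com/app.css?v=2"},
     "response": {"status": 200, "content": {"mimeType": "text/css"}}},
    {"request": {"method": "GET", "url": "https://example.com/api/products?page=1"},
     "response": {"status": 200, "content": {"mimeType": "application/json"}}},
]}}
print(extract_endpoints(har))
# → [{'method': 'GET', 'url': 'https://example.com/api/products', 'status': 200}]
```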
Automated Discovery (Unbrowse)
Unbrowse automates the entire discovery pipeline:
- Browse any website through Unbrowse
- Dual-source capture (HAR + JavaScript interceptor) catches all API calls
- Endpoint extraction separates API calls from static assets
- URL template inference parameterizes variable path segments
- Schema inference types every request and response
- Auth detection classifies authentication requirements
- Graph construction maps relationships between endpoints
The result is a complete, typed, graph-structured catalog of every shadow API on the site — produced automatically from a normal browsing session.
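The URL template inference step can be illustrated with a toy heuristic: segments that vary across observed URLs, or that look like IDs, become parameters. This is not Unbrowse's actual algorithm, and the sketch assumes all observed URLs have the same number of path segments:

```python
import re

def infer_template(urls: list[str]) -> str:
    """Collapse variable path segments across observed URLs into {param} slots.
    Toy heuristic: a segment is a parameter if it differs between examples
    or looks like a numeric/hex identifier."""
    split = [u.rstrip("/").split("/") for u in urls]
    out = []
    for segments in zip(*split):
        looks_like_id = re.fullmatch(r"\d+|[0-9a-f]{8,}", segments[0])
        if len(set(segments)) == 1 and not looks_like_id:
            out.append(segments[0])
        else:
            out.append("{param}")
    return "/".join(out)

print(infer_template([
    "https://example.com/api/products/1042",
    "https://example.com/api/products/87",
]))
# → https://example.com/api/products/{param}
```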
Why Shadow APIs Matter for AI Agents
AI agents have a web interaction problem. When an agent needs data from a website, it has three options:
Option 1: Browser Automation
Launch a headless browser, navigate to the page, wait for rendering, then extract data from the DOM.
Problems:
- Slow (5-15 seconds per page)
- Fragile (DOM selectors break on any UI change)
- Resource-intensive (each browser instance uses 500MB+ RAM)
- Detectable (anti-bot systems flag headless browsers)
- Unstructured (you get HTML, not data)
Option 2: Public API
Call the site's public API, if one exists.
Problems:
- Many sites do not have public APIs
- Those that do require developer accounts, API keys, and rate limit management
- Public APIs often expose a subset of the data visible on the website
- Each API is different — no unified interface
Option 3: Shadow API (via Unbrowse)
Call the same API endpoints the website's frontend uses.
Advantages:
- Fast (direct HTTP call, no rendering — sub-100ms from cache)
- Structured (JSON responses with typed fields)
- Complete (every data point visible on the website is available via its shadow API)
- Authenticated (Unbrowse manages cookies and session state)
- Unified (Unbrowse provides a consistent interface across all sites)
Shadow APIs give AI agents the best of both worlds: the data completeness of browser automation with the speed and structure of a public API.
The Shadow API Lifecycle
Shadow APIs are not static. Understanding their lifecycle helps you work with them reliably.
Discovery
A shadow API endpoint is first captured when someone browses a website and a capture tool (DevTools, HAR recorder, Unbrowse) records the request. The endpoint is "discovered" when it is extracted, parameterized, and cataloged.
Verification
A discovered endpoint must be verified — can it be called independently (outside the context of the original page load) and return useful data? Some endpoints require specific request sequences, CSRF tokens, or referrer headers that make standalone calls fail.
Unbrowse's verification loop tests endpoints every 6 hours, tracking which ones work independently and which require dependency resolution.
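The core of any such verification check is classifying what a standalone replay returned. An illustrative sketch — the category names are made up for this example, not Unbrowse's actual taxonomy:

```python
import json

def classify_replay(status: int, content_type: str, body: str) -> str:
    """Classify the outcome of replaying an endpoint outside its original
    page load (illustrative categories only)."""
    if status in (401, 403):
        return "needs-auth"      # session cookie, CSRF token, or login required
    if status == 404:
        return "gone"            # endpoint removed, or the URL was mis-templated
    if status == 200 and "json" in content_type:
        try:
            json.loads(body)
            return "ok"          # callable standalone, returns parseable data
        except ValueError:
            pass
    return "needs-context"       # likely depends on referrer, sequence, or tokens

print(classify_replay(200, "application/json", '{"items": []}'))  # → ok
print(classify_replay(403, "text/html", "Forbidden"))             # → needs-auth
```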
Maturation
As more traffic flows through a shadow API endpoint, its metadata improves:
- Schema becomes more precise (seeing 1,000 responses reveals nullable fields, enum values, and edge cases)
- URL template parameters get better semantic labels
- Auth requirements are fully classified
- Pagination patterns are identified
- Related endpoints are linked in the graph
A mature shadow API endpoint in the Unbrowse marketplace has richer metadata than most public API documentation.
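The schema refinement described above can be sketched as a simple merge: collect the types observed for each field, and flag fields that are sometimes absent. A toy version over two hypothetical responses:

```python
def merge_schema(samples: list[dict]) -> dict:
    """Infer a field -> observed-type-names mapping from response samples.
    Fields that are sometimes absent get an extra "missing" marker."""
    schema: dict[str, set] = {}
    for sample in samples:
        for key, value in sample.items():
            schema.setdefault(key, set()).add(type(value).__name__)
    for key, types in schema.items():
        if any(key not in sample for sample in samples):
            types.add("missing")
    return {key: sorted(types) for key, types in schema.items()}

# Two hypothetical responses from the same endpoint: one reveals that
# "discount" is nullable, the other that it is optional.
print(merge_schema([
    {"id": 1, "name": "Widget", "discount": None},
    {"id": 2, "name": "Gadget"},
]))
# → {'id': ['int'], 'name': ['str'], 'discount': ['NoneType', 'missing']}
```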
Drift
Websites update their frontends. When they do, shadow APIs can change:
- Endpoint URL might change (/api/v1/ to /api/v2/)
- Response schema might add or remove fields
- Auth mechanism might change
- Endpoint might be removed entirely
Drift detection is critical. Unbrowse's verification loop catches drift automatically — schema changes trigger alerts, and endpoints that consistently fail get removed from the marketplace.
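A minimal drift check is a field-level diff between the previously inferred schema and a fresh one. A sketch, comparing keys only:

```python
def diff_schemas(old: dict, new: dict) -> dict:
    """Report fields that appeared or disappeared between two inferred
    response schemas (values are type names; only keys are compared here)."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
    }

# Hypothetical before/after schemas for the same endpoint
print(diff_schemas(
    {"id": "int", "name": "str"},
    {"id": "int", "title": "str"},
))
# → {'added': ['title'], 'removed': ['name']}
```

A non-empty diff is the signal to alert; a replay that fails entirely is the signal to deprecate.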
Deprecation
When a shadow API endpoint stops working permanently (the website removed it), it is deprecated in the marketplace. Its reliability score drops to zero, and it stops appearing in resolve results. If a new endpoint replaces it, the graph structure helps identify the replacement.
Common Shadow API Patterns
Catalog endpoints across 500+ domains and patterns emerge.
REST-Style JSON APIs
The most common pattern. Standard HTTP methods, JSON responses, URL-based resource identification.
GET /api/products/{id}
GET /api/search?q={query}&page={page}
POST /api/cart/add
Examples: Amazon, Shopify stores, most SaaS dashboards.
GraphQL
A single endpoint that accepts query documents in the request body. Increasingly common on large platforms.
POST /graphql
Body: {"query": "{ user(id: 123) { name, email } }"}
Examples: GitHub (api.github.com/graphql), Facebook, Twitter/X, Reddit (gql.reddit.com), Shopify Admin.
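Calling a GraphQL shadow endpoint is a single POST with the query document in the body. A Python sketch that builds (but does not send) such a request, against a hypothetical /graphql endpoint:

```python
import json
import urllib.request

# Hypothetical query; real queries are visible in the Network tab as the
# request body of POSTs to the site's GraphQL endpoint.
query = "{ user(id: 123) { name email } }"

req = urllib.request.Request(
    "https://example.com/graphql",  # hypothetical; real hosts vary (e.g. gql.reddit.com)
    data=json.dumps({"query": query}).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would return a response shaped like
# {"data": {"user": {"name": ..., "email": ...}}}
```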
BFF (Backend-for-Frontend) APIs
Custom endpoints designed for a specific frontend view. They aggregate data from multiple backend services into a single response.
GET /api/dashboard?include=stats,alerts,recent_activity
Examples: Most SaaS dashboards, admin panels.
Protobuf/Binary APIs
Some high-performance APIs use binary formats (Protocol Buffers, MessagePack) instead of JSON. These require schema knowledge to decode.
GET /api/data?format=proto
Response: <binary protobuf>
Examples: Google services, YouTube (InnerTube API).
Server-Sent Events / Streaming
Long-lived connections that stream updates. Not traditional request/response but still valuable shadow APIs.
GET /api/stream/notifications
Response: data: {"type": "new_message", "id": 456}
Examples: Chat platforms, real-time dashboards, notification systems.
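Once captured, the data: lines of an event stream are easy to parse. A minimal sketch for JSON-bearing SSE payloads, ignoring event:/id: fields and multi-line data records:

```python
import json

def parse_sse(stream: str) -> list[dict]:
    """Extract JSON payloads from the data: lines of a text/event-stream body."""
    events = []
    for line in stream.splitlines():
        if line.startswith("data: "):
            events.append(json.loads(line[len("data: "):]))
    return events

sample = (
    'data: {"type": "new_message", "id": 456}\n\n'
    'data: {"type": "read", "id": 456}\n\n'
)
print(parse_sse(sample))
# → [{'type': 'new_message', 'id': 456}, {'type': 'read', 'id': 456}]
```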
Legal Considerations
Shadow API usage occupies a nuanced legal space. Key considerations:
- You are accessing data your browser already receives. No encryption is bypassed. No access controls are circumvented. The data is sent to your browser in the normal course of using the website.
- Terms of Service vary. Some websites explicitly prohibit automated access, even to endpoints your browser uses. Others have no such restrictions. Always review the ToS.
- Rate limits should be respected. Even though shadow APIs lack documented rate limits, hammering them with thousands of requests per minute is both technically harmful and legally risky.
- Authentication boundaries matter. Using your own session cookies to access data you are authorized to see is different from using stolen credentials to access data you are not.
Unbrowse's passive capture model helps here: it only captures traffic from your own browsing sessions, using your own credentials, at the rate you naturally browse.
Building With Shadow APIs
If you want to integrate shadow API data into your applications, Unbrowse provides the infrastructure:
# Discover shadow APIs by browsing
unbrowse go https://target-website.com
# List discovered endpoints
unbrowse skills list --domain target-website.com
# Resolve a data need against discovered routes
unbrowse resolve "get product details from target-website.com"
# Execute a specific cached route
unbrowse execute --skill target-website.com --operation getProduct --params '{"id": "12345"}'
For programmatic access, the MCP server integration lets any AI agent discover and use shadow APIs:
{
"mcpServers": {
"unbrowse": {
"command": "unbrowse",
"args": ["serve", "--mcp"]
}
}
}
The Future of Shadow APIs
Shadow APIs are not going away. The SPA architecture that creates them is the dominant web development paradigm, and that is unlikely to change. If anything, shadow APIs are becoming richer as frontend frameworks demand more structured data from backends.
The question is not whether shadow APIs will continue to exist, but whether they will be treated as a first-class data source — cataloged, verified, and made available through standard protocols.
Unbrowse's thesis is that they should be. The data flowing through shadow APIs is the same data that public APIs serve, often with better coverage and fresher results. Making that data accessible to AI agents through a verified, cached, paid marketplace is the natural evolution.
Every modern website already has an API. It is just not documented yet.
Start discovering: npx unbrowse setup