The arXiv Paper Explained: Internal APIs Are All You Need

An accessible summary of the Unbrowse research paper (arXiv:2604.00694) for developers. Key findings: 3.6x mean speedup over browser automation, 5.4x median, 94 domains tested, sub-100ms cached routes.

Lewis Tham
April 3, 2026

In March 2026, we published a research paper on arXiv titled "Internal APIs Are All You Need" (arXiv:2604.00694). The paper presents a formal evaluation of a simple thesis: AI agents do not need to render web pages to get data from websites. They can call the same API endpoints the website's frontend calls, and doing so is substantially faster: 3.6x on mean latency in the paper's evaluation.

This article summarizes the paper's key findings, methodology, and implications for developers building AI agents. No academic jargon. Just the results and what they mean.

The Thesis

Every modern website is a thin frontend that calls backend APIs. When you load Twitter, your browser does not receive a pre-rendered page of tweets. It receives a JavaScript shell that makes GraphQL calls to api.x.com, which returns structured JSON data, which the JavaScript then renders into the DOM.

Browser automation tools (Playwright, Puppeteer, Selenium) make AI agents repeat this entire process: launch a browser, load the JavaScript, execute the API calls, wait for rendering, then parse the DOM to extract the data that was already available as structured JSON before rendering began.

The paper asks: what if agents skipped the rendering step entirely and called the internal APIs directly?

The Methodology

We tested this approach across 94 domains spanning 12 categories:

  • Social media (Twitter/X, Reddit, LinkedIn, Facebook, Instagram)
  • Developer platforms (GitHub, GitLab, Stack Overflow, npm, PyPI)
  • E-commerce (Amazon, Shopify stores, eBay, Etsy)
  • Search engines (Google, Bing, DuckDuckGo)
  • News and media (HackerNews, Medium, Substack, major news sites)
  • SaaS dashboards (Notion, Slack, Linear, Jira, Figma)
  • Financial data (Yahoo Finance, Coinbase, Robinhood)
  • Travel (Booking.com, Airbnb, Google Flights)
  • Food and delivery (DoorDash, Uber Eats, Yelp)
  • Real estate (Zillow, Realtor.com, Redfin)
  • Government and public data (SEC EDGAR, data.gov)
  • AI and ML platforms (Hugging Face, Replicate, Weights & Biases)

For each domain, we performed identical data retrieval tasks using two methods:

  1. Browser automation: Playwright with standard configuration, navigating to the page and extracting data via DOM selectors
  2. Internal API: direct HTTP calls to the internal ("shadow") API endpoints discovered by Unbrowse, returning structured JSON

Each task was run 10 times. We measured end-to-end latency (from request initiation to data availability), response size, data completeness, and reliability.
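The measurement loop can be sketched roughly as follows. This is a minimal illustration, not the paper's harness: `measure` and `summarize` are hypothetical names, and the task is assumed to be any zero-argument callable that returns once the data is available.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure(task: Callable[[], object], runs: int = 10) -> List[float]:
    """Run `task` `runs` times and return end-to-end latencies in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        task()  # covers request initiation through data availability
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Mean and median of the measured latencies, in milliseconds."""
    return {
        "mean_ms": statistics.mean(latencies),
        "median_ms": statistics.median(latencies),
    }
```

Passing the same task definition to both methods (one callable that drives a browser, one that calls the API) keeps the comparison like for like.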

Key Finding 1: 3.6x Mean Speedup

Across all 94 domains, internal API calls were 3.6 times faster than browser automation on average.

Metric | Browser Automation | Internal API | Speedup
Mean latency | 8,240ms | 2,289ms | 3.6x
Median latency | 7,100ms | 1,315ms | 5.4x
P95 latency | 18,500ms | 4,200ms | 4.4x
P99 latency | 32,000ms | 8,100ms | 3.9x

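Each speedup is the ratio of the two latency figures in its row, not an average of per-domain speedups. The median ratio (5.4x) exceeds the mean ratio (3.6x) because the internal API's mean (2,289ms) sits well above its median (1,315ms): a minority of slow API calls pulls its mean up proportionally more than the browser's long tail of slow page loads (complex SPAs with heavy JavaScript, slow third-party scripts, and rendering-intensive pages) pulls up the browser's mean (8,240ms against a 7,100ms median). In absolute terms internal APIs are still more consistent, with a P99 of 8,100ms against 32,000ms, because they skip all of that rendering work.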
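The mean, median, and P95 speedups can be reproduced by dividing the reported browser latency by the reported internal API latency for the same statistic. The sketch below does that and also computes each method's mean-to-median ratio as a rough indicator of skew. The variable names are ours, and the inputs are the rounded aggregates from this article, not the paper's raw data.

```python
# Aggregate latencies in milliseconds, as reported in this article.
browser = {"mean": 8240, "median": 7100, "p95": 18500, "p99": 32000}
internal_api = {"mean": 2289, "median": 1315, "p95": 4200, "p99": 8100}

# Speedup = browser latency / internal API latency for the same statistic.
speedup = {stat: browser[stat] / internal_api[stat] for stat in browser}

# Mean-to-median ratio: a rough indicator of right skew in each distribution.
skew = {
    "browser": browser["mean"] / browser["median"],
    "internal_api": internal_api["mean"] / internal_api["median"],
}

for stat, value in speedup.items():
    print(f"{stat}: {value:.2f}x")
```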

Where the Time Goes

Browser automation latency breaks down as:

  • Browser launch: 500-2,000ms (cold) or 50-100ms (warm pool)
  • DNS + TLS: 100-300ms
  • Page load (HTML + JS): 1,000-5,000ms
  • JavaScript execution: 500-3,000ms
  • API calls (triggered by JS): 200-2,000ms
  • Rendering: 500-2,000ms
  • DOM extraction: 100-500ms

Internal API latency breaks down as:

  • DNS + TLS: 100-300ms (same)
  • API call: 200-2,000ms (same)
  • JSON parsing: 1-10ms

The browser adds roughly 2,500-12,000ms of overhead (the sum of the cold launch, page load, JavaScript execution, and rendering ranges above), all of which is unnecessary if you are calling the API directly.
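The overhead can be checked by summing the low and high ends of the four browser-only steps. This sketch assumes a cold browser launch and leaves out DOM extraction. The dictionary keys are our own labels for the steps listed in the breakdown.

```python
# (low, high) millisecond ranges for the steps only the browser path performs.
# Browser launch uses the cold-start range.
browser_only_steps = {
    "browser_launch_cold": (500, 2000),
    "page_load": (1000, 5000),
    "javascript_execution": (500, 3000),
    "rendering": (500, 2000),
}

overhead_low = sum(low for low, _ in browser_only_steps.values())
overhead_high = sum(high for _, high in browser_only_steps.values())
print(f"browser-only overhead: {overhead_low}-{overhead_high} ms")
```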

Key Finding 2: Cached Routes Under 100ms

When the internal API route is cached in Unbrowse's marketplace, latency drops dramatically:

Resolution Source | Median Latency
Browser automation | 7,100ms
Direct API call | 1,315ms
Cached route (Unbrowse) | 67ms

Cached route resolution takes only the time needed to look up the route in the cache and return the stored response. For data that does not change frequently (product listings, user profiles, repository metadata), cached routes are sufficient and consistently deliver sub-100ms latency.
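To make the lookup-then-fetch idea concrete, here is a small in-memory sketch with a time-to-live per entry. It is not Unbrowse's marketplace cache: `RouteCache`, `resolve_cached`, and the TTL policy are assumptions made for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class RouteCache:
    """Tiny in-memory cache keyed by route, with a shared time to live."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, route: str) -> Optional[Any]:
        entry = self._entries.get(route)
        if entry is None:
            return None
        stored_at, response = entry
        if self._clock() - stored_at > self._ttl:
            del self._entries[route]  # expired: force a fresh fetch
            return None
        return response

    def put(self, route: str, response: Any) -> None:
        self._entries[route] = (self._clock(), response)


def resolve_cached(route: str, cache: RouteCache, fetch: Callable[[str], Any]) -> Any:
    """Return the cached response if it is still fresh, otherwise fetch and store it."""
    cached = cache.get(route)
    if cached is not None:
        return cached
    response = fetch(route)
    cache.put(route, response)
    return response
```

A short TTL suits fast-changing data such as prices. Slowly changing data such as repository metadata can tolerate a longer one.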

Key Finding 3: Data Completeness

A common concern with shadow APIs is that they might not expose all the data visible on the website. We measured data completeness — the percentage of DOM-visible data fields that were also available in the shadow API response.

Domain Category | Avg. Completeness
Social media | 98.2%
Developer platforms | 99.1%
E-commerce | 96.7%
Search engines | 94.3%
SaaS dashboards | 99.4%
Financial data | 97.8%
Overall | 97.6%

Shadow APIs return 97.6% of the data visible on the rendered page. The missing 2.4% consists of:

  • Client-side computed values (relative timestamps like "2 hours ago")
  • UI-specific metadata (layout hints, A/B test variants)
  • Data that the page assembles from separate API calls and that is therefore absent from any single response (addressable with endpoint graph traversal)

In many cases, the shadow API actually returns more data than is visible on the page — internal IDs, metadata fields, and related entity references that the frontend discards during rendering.
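The completeness metric can be sketched as a set comparison. This simplified version assumes fields are matched by flat name, which is our assumption; the paper's matching procedure may differ. Both function names are hypothetical.

```python
from typing import Any, Iterable, Mapping, Set


def completeness(dom_fields: Iterable[str], api_response: Mapping[str, Any]) -> float:
    """Percentage of DOM-visible field names also present in the API response."""
    visible = set(dom_fields)
    if not visible:
        return 100.0  # nothing visible, so nothing can be missing
    present = {name for name in visible if name in api_response}
    return 100.0 * len(present) / len(visible)


def extra_fields(dom_fields: Iterable[str], api_response: Mapping[str, Any]) -> Set[str]:
    """Fields the API returns that never appear on the rendered page."""
    return set(api_response) - set(dom_fields)
```

For example, a page showing a title, author, score, and a relative timestamp, backed by a response that carries the first three plus an internal ID, would score 75% with one extra field.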

Key Finding 4: Reliability

We measured reliability as the percentage of attempts that returned the expected data without errors.

Method | Reliability (10 runs per domain)
Browser automation | 87.3%
Internal API (first call) | 94.1%
Internal API (with retry) | 99.2%
Cached route (Unbrowse) | 99.7%

Browser automation's 87.3% reliability reflects real-world challenges: anti-bot detection triggering on some runs, JavaScript errors, timeouts on slow pages, and DOM structure changes between test runs.

Internal APIs are more reliable because they skip the rendering step where most failures occur. The 5.9% of first-call API requests that still fail are caused by authentication expiry (session cookies timing out), rate limiting, and transient server errors.

With Unbrowse's retry logic and auth refresh, reliability reaches 99.2%. Cached routes achieve 99.7% because they serve stored data without hitting the origin server.
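A retry loop with auth refresh might look something like the sketch below. It is not Unbrowse's implementation: the exception classes, `call_with_retry`, and the backoff schedule are all assumptions chosen to illustrate how the three first-call failure modes can be handled.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class AuthExpired(Exception):
    """Raised by the caller's request function when the session is no longer valid."""


class TransientError(Exception):
    """Raised for rate limiting or temporary server errors."""


def call_with_retry(
    request: Callable[[], T],
    refresh_auth: Callable[[], None],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `request`, refreshing auth on expiry and backing off on transient errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except AuthExpired:
            if attempt == max_attempts:
                raise
            refresh_auth()  # renew cookies or tokens, then retry immediately
        except TransientError:
            if attempt == max_attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
    raise AssertionError("unreachable")
```

Injecting `sleep` keeps the function testable without real delays.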

Key Finding 5: Resource Usage

Browser instances are expensive. We measured resource consumption:

Resource | Browser Automation | Internal API
Memory per request | 512-1,024 MB | 5-15 MB
CPU (peak) | 1.0-2.0 vCPU | 0.01-0.05 vCPU
Network transfer | 2-10 MB/page | 5-100 KB/request
Concurrent capacity (4GB RAM) | 4-8 instances | 200+ requests

A single server with 4GB of RAM can run 4-8 concurrent browser instances for automation. The same server can handle 200+ concurrent internal API calls. That is 25-50x more concurrent requests on the same hardware (200 / 8 = 25, 200 / 4 = 50), with a corresponding improvement in throughput per dollar of compute.
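The capacity figures follow from the memory numbers. The back-of-the-envelope sketch below considers memory only and ignores operating system and runtime overhead, which is a simplification on our part.

```python
ram_mb = 4 * 1024  # the 4GB server

# Memory per request in MB as (low, high).
browser_mem = (512, 1024)
api_mem = (5, 15)

# Memory-only upper bounds on concurrency, as (worst case, best case).
browser_instances = (ram_mb // browser_mem[1], ram_mb // browser_mem[0])
api_requests = (ram_mb // api_mem[1], ram_mb // api_mem[0])

# Capacity ratio using the article's conservative "200+" figure.
ratio = (200 / browser_instances[1], 200 / browser_instances[0])
print(browser_instances, api_requests, ratio)
```

The memory-only bound for API calls is well above 200, which suggests the reported figure leaves headroom for other limits such as CPU and network.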

Domain-Specific Results

Some domains showed particularly interesting results.

GitHub: 6.2x Speedup

GitHub's internal API closely mirrors its public API (api.github.com). The shadow API surface is rich (50+ endpoints discovered per session) and highly reliable. Browser automation is particularly slow on GitHub due to heavy JavaScript rendering of code views, diff displays, and issue threads.

Reddit: 4.8x Speedup

Reddit's shadow API (oauth.reddit.com) returns cleanly structured JSON for post listings, comments, user profiles, and search. The browser automation path is slow because Reddit's new UI is a heavy React application with multiple re-renders.

Amazon: 3.1x Speedup

Amazon's shadow API surface is massive (60+ endpoints) but complex. Product pages make 15+ internal API calls, some of which are interdependent. The speedup is lower than average because Amazon's API calls themselves are slow (the server does heavy personalization).

LinkedIn: 7.8x Speedup (Highest)

LinkedIn showed the highest speedup. Browser automation on LinkedIn is extremely slow due to aggressive anti-bot measures (which add delays), heavy JavaScript, and complex rendering. The Voyager API (LinkedIn's shadow API) returns clean JSON with rich professional data. However, auth complexity is high — LinkedIn's session management is among the most challenging to maintain.

Google Search: 2.1x Speedup (Lowest)

Google Search showed the lowest speedup because Google's rendering is already highly optimized — the page loads fast even in a browser. The shadow API surface is also sparser than expected; Google pre-renders more content server-side than most SPAs.

Implications for AI Agent Architecture

The paper draws several conclusions for developers building AI agents.

1. Browser Automation Is the Wrong Default

Most AI agent frameworks (LangChain, CrewAI, AutoGPT) default to browser automation for web interactions. The data shows this is 3.6x slower, 25-50x more resource-intensive, and less reliable than internal API access. Agent frameworks should default to API-first resolution and fall back to browser automation only when no API route exists.
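API-first resolution with a browser fallback can be expressed as an ordered chain of resolvers. The sketch below is one possible shape, not an interface from any of the frameworks named here. `Resolver` and `resolve` are hypothetical names, and each resolver is assumed to return `None` when it has no route.

```python
from typing import Any, Callable, List, Optional, Tuple

# A resolver takes a URL and returns data, or None when it has no route for it.
Resolver = Callable[[str], Optional[Any]]


def resolve(url: str, resolvers: List[Tuple[str, Resolver]]) -> Tuple[str, Any]:
    """Try each resolver in order and return (source name, data) from the first hit."""
    for name, resolver in resolvers:
        try:
            data = resolver(url)
        except Exception:
            continue  # treat a failing resolver as a miss and move on
        if data is not None:
            return name, data
    raise LookupError(f"no resolver could handle {url}")
```

Ordering the chain as cached route, then internal API, then browser automation means the browser is launched only when the cheaper paths have nothing to offer.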

2. The API Route Cache Is Critical Infrastructure

The 67ms cached route latency versus 7,100ms browser latency means a well-populated route cache is the single most impactful optimization for agent web access. Each cached route saves roughly 7 seconds per access at the median. At scale, this is the difference between agents that feel responsive and agents that feel broken.

3. Passive Discovery Scales Better Than Active Probing

We tested both active endpoint discovery (systematically probing URL patterns) and passive discovery (recording traffic from normal browsing). Passive discovery found 23% more endpoints on average because it captures endpoints that active probing misses — those triggered by user interactions, dynamic state, and personalization.
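Passive discovery amounts to filtering recorded traffic for API-like responses and collapsing concrete URLs into templates. The sketch below assumes a simplified entry shape (`method`, `url`, `content_type`) loosely modeled on a HAR export and templates only numeric path segments. Both choices are ours, and real traffic would need richer heuristics.

```python
import re
from typing import Dict, Iterable, Set, Tuple
from urllib.parse import urlsplit

# One recorded request. The key names are a hypothetical, simplified shape.
Entry = Dict[str, str]


def template_path(path: str) -> str:
    """Replace purely numeric path segments with a placeholder."""
    return "/".join(
        "{id}" if re.fullmatch(r"\d+", segment) else segment
        for segment in path.split("/")
    )


def discover_endpoints(entries: Iterable[Entry]) -> Set[Tuple[str, str, str]]:
    """Collect (method, host, templated path) for every JSON response seen."""
    endpoints = set()
    for entry in entries:
        if "json" not in entry.get("content_type", "").lower():
            continue  # skip HTML, images, scripts, and other non-API traffic
        parts = urlsplit(entry["url"])
        endpoints.add((entry["method"].upper(), parts.netloc, template_path(parts.path)))
    return endpoints
```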

4. Endpoint Graphs Enable Multi-Step Resolution

Flat endpoint lists (like OpenAPI specs) miss the dependency relationships between endpoints. Our graph-based representation — with parent/child, pagination, auth dependency, and prefetch edges — enabled automatic multi-step resolution for 89% of complex queries that required chaining multiple API calls.
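One way to picture multi-step resolution is as a dependency-ordered walk over the endpoint graph. The sketch below is an illustration, not the paper's data structure: it stores edges from each endpoint to the endpoints it depends on, tagged with an edge kind, and returns a valid call order. The endpoint names and the `call_order` function are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Maps an endpoint to (dependency, edge kind) pairs, e.g. ("/session", "auth").
Graph = Dict[str, List[Tuple[str, str]]]


def call_order(graph: Graph, target: str) -> List[str]:
    """Return endpoints in the order they must be called to reach `target`."""
    order: List[str] = []
    visiting: Set[str] = set()

    def visit(node: str) -> None:
        if node in order:
            return
        if node in visiting:
            raise ValueError(f"dependency cycle at {node}")
        visiting.add(node)
        for dependency, _kind in graph.get(node, []):
            visit(dependency)
        visiting.discard(node)
        order.append(node)  # all of this node's dependencies are already listed

    visit(target)
    return order
```

A flat list would record the same endpoints but not the fact that the comments endpoint needs a post ID and a session first. The edges carry that information.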

5. The Economics Enable a Self-Sustaining Marketplace

The x402 micropayment model turns endpoint discovery into a monetizable activity. Miners (users who browse and discover routes) earn from agents who consume those routes. The paper models the marketplace economics and shows that the system reaches equilibrium when the marketplace contains routes for the top 1,000 most-requested domains.

Limitations

The paper acknowledges several limitations:

  • Auth maintenance — Shadow API access requires valid authentication. Session cookies expire. OAuth tokens need refreshing. The system must continuously maintain auth state.
  • Schema drift — Shadow APIs can change without notice. The verification loop catches drift, but there is a window between a change and detection where stale routes may be served.
  • POST-heavy APIs — Some shadow APIs (especially GraphQL) use POST requests with complex bodies. These are harder to parameterize and template than GET requests.
  • Binary APIs — Endpoints returning protobuf or other binary formats require schema knowledge to decode. The current system handles JSON well but has limited binary format support.
  • Legal uncertainty — The legal status of calling undocumented APIs varies by jurisdiction and by website. The paper does not provide legal advice.
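The schema drift limitation can be made concrete with a simple structural comparison. This sketch is not the paper's verification loop. It reduces a JSON response to its key names and value types and flags any difference from a stored baseline. `shape` and `has_drifted` are hypothetical names.

```python
from typing import Any


def shape(value: Any) -> Any:
    """Reduce a JSON value to its structure: key names and value type names."""
    if isinstance(value, dict):
        return {key: shape(item) for key, item in value.items()}
    if isinstance(value, list):
        # Only the first element is inspected, so mixed-type lists are not covered.
        return [shape(value[0])] if value else []
    return type(value).__name__


def has_drifted(stored_shape: Any, fresh_response: Any) -> bool:
    """True when the fresh response no longer matches the recorded shape."""
    return shape(fresh_response) != stored_shape
```

A check this strict will also flag harmless changes, such as a list that happens to be empty or a newly added field, so a production version would likely need a more tolerant comparison.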

Try It Yourself

The paper's methodology is reproducible. The Unbrowse evaluation framework tests against the same 94 domains:

npx unbrowse setup
bun run eval:codex:product-success

The full paper is available at: https://arxiv.org/abs/2604.00694

The codebase is open source. The marketplace is live. The 3.6x speedup is waiting for your agents.

Internal APIs are all you need.