How Unbrowse Works: Architecture of an API Discovery Engine
A complete technical walkthrough of Unbrowse's architecture, from passive traffic capture through endpoint extraction, authentication detection, graph construction, and cache-first resolution.
Unbrowse is not a browser. It is not a scraper. It is not a proxy. Unbrowse is an API discovery engine that happens to use a browser as its sensor.
When you browse a website through Unbrowse, the system captures every network request, extracts structured API endpoints, detects authentication patterns, builds a typed endpoint graph, and publishes it to a marketplace. When an AI agent later needs data from that website, Unbrowse resolves the request against the cache in under 100 milliseconds instead of spending 8+ seconds rendering a page.
This article walks through the complete architecture: every component, every pipeline stage, every design decision.
The Stack
Unbrowse has four layers:
- Kuri — A Zig-native CDP (Chrome DevTools Protocol) broker. 464KB binary. ~3ms cold start. This is the browser runtime.
- Capture Layer — Passive HAR recording + JavaScript fetch/XHR interceptor. Records all network traffic during a browse session.
- Intelligence Layer — Endpoint extraction, auth detection, schema inference, graph construction, semantic description. This is where raw traffic becomes structured API knowledge.
- Marketplace — Cloudflare Worker API that stores, ranks, and serves endpoint graphs. Cache-first resolution for AI agent requests.
Layer 1: Kuri — The Browser Runtime
Kuri is not Puppeteer. It is not Playwright. It is a purpose-built CDP broker written in Zig that does exactly three things:
- Launches Chrome with the right flags
- Manages CDP connections to browser tabs
- Executes DevTools Protocol commands
Why Zig? Because Kuri needs to be bundled inside the Unbrowse npm package and work on every platform without external dependencies. The entire binary is 464KB. It cold-starts in ~3 milliseconds. Compare this to Playwright's 50MB+ browser download or Puppeteer's Node.js overhead.
Kuri always runs with HEADLESS=false. This is not a headless automation tool — it is a real browser with a real UI. The non-headless mode enables two critical capabilities:
- Stealth extension — Kuri loads an anti-bot extension that patches common fingerprinting vectors (navigator.webdriver, chrome.runtime, WebGL renderer strings). This lets Unbrowse browse sites that block headless browsers.
- Persistent Chrome profile — --user-data-dir maintains cookies, localStorage, and session state across browsing sessions. You log in once; Kuri remembers.
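As a rough illustration, the launch resembles a plain Chrome invocation with these flags (the flags are real Chrome flags, but the profile directory, extension path, and port are placeholders, not Kuri's actual configuration):

```shell
# Illustrative Chrome launch approximating Kuri's setup.
#   --remote-debugging-port exposes CDP for the broker
#   --user-data-dir keeps cookies/localStorage across sessions
#   --load-extension loads the anti-fingerprinting stealth extension
google-chrome \
  --remote-debugging-port=9222 \
  --user-data-dir="$HOME/.unbrowse/profile" \
  --load-extension="/path/to/stealth-extension"
```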
Cookie Injection
When a browse session starts (unbrowse go <url>), Kuri extracts cookies from the user's real Chrome or Firefox SQLite database and injects them into the browsing tab via CDP's Network.setCookie. This means you do not need to re-authenticate through Unbrowse — your existing browser sessions carry over.
This is critical for mining. Many valuable endpoints require authentication. Without cookie injection, miners would need to manually log in to every site through Unbrowse. With it, mining is completely passive.
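A minimal sketch of what the injection step sends, assuming cookies have already been read from the profile database. The message shape follows CDP's Network.setCookie parameters; the helper name and cookie values are illustrative:

```javascript
// Build the CDP Network.setCookie message Kuri sends for each cookie
// pulled from the user's browser profile (sketch; field coverage abridged).
function buildSetCookieCommand(id, cookie) {
  return JSON.stringify({
    id, // CDP message id, echoed back in the response
    method: "Network.setCookie",
    params: {
      name: cookie.name,
      value: cookie.value,
      domain: cookie.domain,
      path: cookie.path ?? "/",
      secure: cookie.secure ?? false,
      httpOnly: cookie.httpOnly ?? false,
    },
  });
}

// The resulting JSON string goes out over the CDP WebSocket connection.
const msg = buildSetCookieCommand(1, {
  name: "session", value: "abc123", domain: ".example.com", secure: true,
});
```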
Layer 2: Capture — Dual-Source Traffic Recording
Capture is the foundation of everything. If we miss a network request, we miss an endpoint. Two independent capture mechanisms run simultaneously:
HAR Recording (CDP-Based)
Kuri enables CDP network events (Network.enable) and records every request/response pair as a HAR (HTTP Archive) entry. This captures:
- Request URL, method, headers, body
- Response status, headers, body
- Timing information
- Request initiator chain (which script triggered this request)
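An abridged HAR 1.2 entry carrying the fields above might look like this (values are illustrative; `_initiator` is Chrome's vendor extension for the initiator chain):

```javascript
// Abridged HAR entry as recorded from CDP network events.
// Only the fields discussed above are shown.
const harEntry = {
  startedDateTime: "2024-01-01T00:00:00.000Z",
  time: 142.3, // total request time in ms
  request: {
    method: "GET",
    url: "https://api.example.com/v1/items?page=2",
    headers: [{ name: "Authorization", value: "Bearer <token>" }],
  },
  response: {
    status: 200,
    headers: [{ name: "Content-Type", value: "application/json" }],
    content: { mimeType: "application/json", text: "[]" },
  },
  _initiator: { type: "script" }, // which script triggered the request
};
```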
HAR capture is reliable for traditional page loads and synchronous XHR. But it has a blind spot: some SPAs fire async fetch requests that CDP's network events miss, especially those initiated during complex state transitions or Web Worker contexts.
JavaScript Interceptor (DOM-Based)
To catch what HAR misses, Unbrowse injects an interceptor script (INTERCEPTOR_SCRIPT) into every page via CDP's Page.addScriptToEvaluateOnNewDocument. This script monkey-patches window.fetch and XMLHttpRequest.prototype.open to log every outgoing request:
// Simplified interceptor logic (fetch patch; XMLHttpRequest is patched similarly)
const originalFetch = window.fetch;
window.fetch = async function (...args) {
  // Normalize the (input, init) arguments into a Request for uniform logging
  const request = new Request(...args);
  logRequest(request.url, request.method, request.headers);
  const response = await originalFetch.apply(this, args);
  logResponse(request.url, response.status, response.headers);
  return response;
};
The interceptor catches requests that CDP misses and provides client-side timing data.
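The XHR half of the patch works the same way. A sketch, using a stand-in class so it runs outside a browser (`logRequest` is the same hypothetical logger as in the fetch patch):

```javascript
// Sketch: patch an XMLHttpRequest-like class's open() to log every request.
function patchXHROpen(XHRClass, logRequest) {
  const originalOpen = XHRClass.prototype.open;
  XHRClass.prototype.open = function (method, url, ...rest) {
    logRequest(url, method); // record before the request goes out
    return originalOpen.call(this, method, url, ...rest);
  };
}

// Minimal stand-in for XMLHttpRequest so the sketch runs anywhere.
class FakeXHR {
  open(method, url) { this.method = method; this.url = url; }
}

const log = [];
patchXHROpen(FakeXHR, (url, method) => log.push({ url, method }));
new FakeXHR().open("GET", "https://api.example.com/items");
// log now holds one record for the intercepted request
```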
Merging Sources
When a browse session ends (tab closed or navigation), both capture sources are merged. Deduplication uses URL + method + timestamp proximity as the merge key. HAR entries are preferred when both sources capture the same request (HAR has richer metadata), but interceptor-only entries fill the gaps.
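The merge step can be sketched as follows. HAR entries win on collisions; two records count as the same request when URL and method match within a timestamp window (the window size here is an assumption):

```javascript
// Merge the two capture sources. HAR entries are kept as-is; interceptor
// entries are added only when no HAR entry matches on URL + method within
// `windowMs` of the same timestamp.
function mergeCaptures(harEntries, interceptorEntries, windowMs = 250) {
  const merged = [...harEntries];
  for (const entry of interceptorEntries) {
    const duplicate = harEntries.some(
      (h) =>
        h.url === entry.url &&
        h.method === entry.method &&
        Math.abs(h.timestamp - entry.timestamp) <= windowMs
    );
    if (!duplicate) merged.push(entry); // interceptor-only: fills a gap
  }
  return merged;
}

const har = [{ url: "/api/items", method: "GET", timestamp: 1000 }];
const xhr = [
  { url: "/api/items", method: "GET", timestamp: 1120 }, // duplicate of HAR entry
  { url: "/api/user", method: "GET", timestamp: 1400 },  // interceptor-only
];
const merged = mergeCaptures(har, xhr);
// merged: the HAR record plus the interceptor-only /api/user entry
```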
Together, the two sources capture over 99% of the API calls a site makes during a session.
Layer 3: Intelligence — From Traffic to Knowledge
Raw network traffic is noise. The intelligence layer transforms it into structured API knowledge through a six-stage pipeline.
Stage 1: Endpoint Extraction
extractEndpoints takes merged traffic and identifies API endpoints. Not every network request is an API call. The extractor filters out:
- Static assets (images, CSS, fonts, JS bundles)
- Tracking pixels and analytics beacons
- CDN resources
- WebSocket connections (handled separately)
What remains are API calls. For each one, the extractor produces:
- URL template — Parameterized URL with path variables identified (e.g., /api/v1/repos/{owner}/{repo}/issues)
- Method — GET, POST, PUT, DELETE, PATCH
- Query parameters — Typed and annotated (required vs. optional, enum values, default values)
- Request body schema — For POST/PUT, the body structure is inferred from the captured data
- Response schema — JSON structure with types, nullable fields, and array detection
URL template extraction is the most critical step. The system must distinguish between path parameters (variable) and path segments (fixed). For example, in /r/MachineLearning/hot.json, is MachineLearning a parameter or a fixed segment? The extractor uses frequency analysis across multiple requests to the same base path and semantic heuristics to determine this.
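The frequency-analysis half of that decision can be sketched like this: compare path segments across multiple captured URLs with the same shape, and turn segments that vary into placeholders. (The placeholder naming is illustrative; the real extractor also applies semantic heuristics.)

```javascript
// Infer a URL template from several captured paths of the same shape:
// segments that differ across captures become {param} placeholders.
function inferTemplate(paths) {
  const split = paths.map((p) => p.split("/").filter(Boolean));
  const width = split[0].length;
  const template = [];
  for (let i = 0; i < width; i++) {
    const values = new Set(split.map((segs) => segs[i]));
    // More than one distinct value at this position => variable segment
    template.push(values.size > 1 ? `{param${i}}` : split[0][i]);
  }
  return "/" + template.join("/");
}

const template = inferTemplate([
  "/api/v1/repos/alice/widgets/issues",
  "/api/v1/repos/bob/gadgets/issues",
]);
// → "/api/v1/repos/{param3}/{param4}/issues"
```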
Stage 2: Auth Detection
extractAuthHeaders scans request headers across all captured endpoints for authentication patterns:
- Bearer tokens — Authorization: Bearer <token>
- API keys — Custom headers like X-API-Key, Api-Key, or query parameters like ?key=
- Cookies — Session cookies that are required for authenticated endpoints
- OAuth tokens — Tokens in Authorization headers with OAuth-specific patterns
- CSRF tokens — Cross-site request forgery tokens in headers or hidden form fields
The detector classifies auth into 15+ SSO provider categories (Google, GitHub, Microsoft, Okta, Auth0, etc.) and records which endpoints require which auth type.
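A stripped-down classifier over the header patterns above might look like this (categories and regexes are illustrative, not Unbrowse's actual rules):

```javascript
// Classify a request header into an auth category, or null if it is not
// an auth header. Mirrors the pattern list above in simplified form.
function classifyAuthHeader(name, value) {
  const n = name.toLowerCase();
  if (n === "authorization" && /^Bearer\s+/i.test(value)) return "bearer";
  if (/^(x-)?api-key$/.test(n)) return "api-key";
  if (n === "cookie") return "cookie";
  if (/csrf/i.test(n)) return "csrf";
  return null;
}

classifyAuthHeader("Authorization", "Bearer eyJ..."); // → "bearer"
classifyAuthHeader("X-API-Key", "abc123");            // → "api-key"
```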
Stage 3: Credential Storage
storeCredential saves extracted auth credentials to a local secure vault, keyed by domain. When Unbrowse later needs to execute a cached route, it can attach the correct auth credentials without re-extraction.
Credentials are stored locally only. They are never transmitted to the marketplace. The marketplace publishes the auth requirement (e.g., "requires Bearer token for github.com") but not the credential itself.
Stage 4: Endpoint Graph Construction
buildSkillOperationGraph is where Unbrowse's architecture diverges most from traditional API documentation. Instead of treating endpoints as a flat list, Unbrowse builds a typed directed graph.
Edge types include:
- Parent/Child — An endpoint that must be called before another (e.g., list repos before getting repo details)
- Pagination — Sequential calls with offset/cursor parameters
- Auth Dependency — An endpoint that provides tokens required by other endpoints
- Prefetch — Endpoints that are typically called together (e.g., user profile + user repos)
- Data Dependency — An endpoint whose response contains IDs needed by another endpoint's request
The graph structure enables intelligent resolution. When an agent requests "get issues for the unbrowse-ai/unbrowse repo," the graph knows that this requires first resolving the repo endpoint, extracting the repo ID, and then calling the issues endpoint with that ID.
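A minimal sketch of that chaining, assuming an acyclic graph (node ids and the edge representation are illustrative):

```javascript
// A tiny typed endpoint graph: repoIssues needs owner/repo values that
// only appear in listRepos responses, so a data-dependency edge links them.
const graph = {
  nodes: {
    listRepos: { template: "/user/repos" },
    repoIssues: { template: "/repos/{owner}/{repo}/issues" },
  },
  edges: [
    { from: "listRepos", to: "repoIssues", type: "data-dependency" },
  ],
};

// Return the call order for a target endpoint: dependencies first.
// Assumes the graph is acyclic (no cycle detection in this sketch).
function callOrder(graph, target, order = []) {
  for (const e of graph.edges) {
    if (e.to === target) callOrder(graph, e.from, order);
  }
  if (!order.includes(target)) order.push(target);
  return order;
}

callOrder(graph, "repoIssues"); // → ["listRepos", "repoIssues"]
```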
Stage 5: Semantic Description
generateLocalDescription and augmentEndpointsWithAgent add human-readable (and LLM-readable) descriptions to every endpoint. The local description uses heuristics — URL patterns, response field names, HTTP methods — to generate a first pass. The agent augmentation step uses an LLM to produce richer descriptions:
- What data this endpoint returns
- When you would use it
- What parameters affect the response
- How it relates to other endpoints in the graph
These descriptions are critical for resolution. When an agent asks for "trending Python repos," the marketplace needs to match that intent against endpoint descriptions to find the right route.
Stage 6: Marketplace Publish
cachePublishedSkill and queueBackgroundIndex publish the complete endpoint graph to the Unbrowse marketplace. The published artifact includes:
- All endpoints with URL templates, schemas, and descriptions
- The full endpoint graph with typed edges
- Auth requirements (not credentials)
- Reliability metadata (initial score based on capture quality)
- Miner identity (for x402 payment routing)
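Putting those pieces together, the published artifact has roughly this shape (field names and values are assumptions for illustration, not Unbrowse's actual wire format):

```javascript
// Illustrative published-skill artifact. Note: the auth block carries the
// requirement only; the credential itself never leaves the local vault.
const publishedSkill = {
  domain: "github.com",
  endpoints: [
    {
      template: "/repos/{owner}/{repo}/issues",
      method: "GET",
      description: "List issues for a repository",
      responseSchema: { type: "array", items: { type: "object" } },
    },
  ],
  graph: {
    edges: [
      { from: "/user/repos", to: "/repos/{owner}/{repo}/issues", type: "data-dependency" },
    ],
  },
  auth: { type: "bearer", required: true }, // requirement, never the token
  reliability: 0.92,                        // initial score from capture quality
  miner: "miner-identity-placeholder",      // identity for x402 payment routing
};
```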
Layer 4: Marketplace — Cache-First Resolution
The marketplace is a Cloudflare Worker that stores endpoint graphs and resolves agent requests against them.
Resolution Pipeline
When an agent calls unbrowse resolve "get trending repos from GitHub", the resolution follows a priority cascade:
1. Local route cache — Check whether this exact request has been resolved before and the cached result is still fresh. Sub-1ms.
2. Marketplace search — Semantic search across all published endpoint graphs, matching intent against endpoint descriptions, URL templates, and schema metadata. 50-200ms.
3. First-pass browser — If no cached route matches, Unbrowse opens a browser session, navigates to the likely URL, and captures endpoints in real time. 5-8 seconds.
4. Browse session handoff — If first-pass capture does not produce a matching endpoint, the session is handed off to the calling agent. The agent drives the browser manually while Unbrowse indexes passively.
The goal is to move as many requests as possible from steps 3 and 4 (slow, expensive) to steps 1 and 2 (fast, cheap). Every successful first-pass capture improves future resolution speed.
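The cascade reduces to a simple loop: try each stage in order and stop at the first one that produces a route. The stage functions below are stand-ins for the real cache lookup, marketplace search, and first-pass browse:

```javascript
// Priority cascade sketch: first stage to return a route wins, so the
// slower stages never run when a faster one hits.
async function resolveIntent(intent, stages) {
  for (const stage of stages) {
    const route = await stage(intent);
    if (route) return route;
  }
  return null; // every stage missed: hand the session off to the agent
}

// Demo stages: the local cache misses, marketplace search hits.
const localCache = async () => null;
const marketplaceSearch = async (intent) => ({ source: "marketplace", intent });
const routePromise = resolveIntent("get trending repos", [localCache, marketplaceSearch]);
```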
Verification Loop
Published endpoints are not static. Websites change their APIs. Endpoints go down. Response schemas drift.
The marketplace runs a verification loop every 6 hours against every published endpoint:
- Execute the endpoint with cached auth credentials
- Compare the response schema against the published schema
- Update the EMA-based reliability score
- If schema drift is detected, flag for re-indexing
- If the endpoint returns errors consistently, reduce its marketplace ranking
The EMA (Exponential Moving Average) scoring means recent failures weigh more heavily than historic successes. An endpoint that was reliable for months but started failing today will see its score drop quickly.
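The update rule is a one-liner; the smoothing factor here is an assumption, not Unbrowse's actual value:

```javascript
// EMA reliability update: each observation is 1 (success) or 0 (failure),
// and recent observations dominate the running score.
function updateScore(prev, success, alpha = 0.3) {
  return alpha * (success ? 1 : 0) + (1 - alpha) * prev;
}

// A long-reliable endpoint (score 1.0) that starts failing drops fast:
let score = 1.0;
for (let i = 0; i < 3; i++) score = updateScore(score, false);
// after three consecutive failures, score ≈ 0.343 (= 0.7^3)
```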
Ranking
When multiple endpoint graphs match a resolve request, the marketplace ranks them by:
- Reliability score — Higher is better. Verified, stable endpoints rank first.
- Graph completeness — More complete graphs (more endpoints, more edge types) rank higher.
- Schema quality — Well-typed, consistent schemas rank higher.
- Freshness — Recently verified endpoints rank higher.
- Miner reputation — Miners with a history of high-quality contributions get a ranking boost.
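One simple way to combine those signals is a weighted sum; the weights below are assumptions for the sketch, not Unbrowse's actual values:

```javascript
// Illustrative weighted ranking over the five signals listed above.
// Each signal is assumed normalized to [0, 1].
function rankScore(
  g,
  w = { reliability: 0.4, completeness: 0.2, schemaQuality: 0.15, freshness: 0.15, reputation: 0.1 }
) {
  return (
    w.reliability * g.reliability +
    w.completeness * g.completeness +
    w.schemaQuality * g.schemaQuality +
    w.freshness * g.freshness +
    w.reputation * g.reputation
  );
}

const stable = rankScore({ reliability: 0.95, completeness: 0.8, schemaQuality: 0.9, freshness: 1.0, reputation: 0.7 });
const flaky  = rankScore({ reliability: 0.30, completeness: 0.8, schemaQuality: 0.9, freshness: 1.0, reputation: 0.7 });
// stable > flaky: reliability dominates the ranking
```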
Design Decisions
Several non-obvious design choices shape the architecture:
Why a Real Browser (Not a Proxy)
Unbrowse could theoretically work as a proxy — intercept traffic from any browser and index it. We chose the embedded browser approach because:
- Proxies break HTTPS pinning on many sites
- Proxies cannot inject the JavaScript interceptor
- Proxies cannot inject cookies from the user's real browser
- A controlled browser environment ensures consistent capture quality
Why Endpoint Graphs (Not Flat Lists)
Flat API documentation (OpenAPI/Swagger) treats every endpoint as independent. In practice, APIs have complex dependency chains. You cannot call /repos/{owner}/{repo}/issues without first knowing the owner and repo values, which come from /user/repos or /search/repositories.
The graph structure encodes these dependencies explicitly, enabling Unbrowse to automatically chain multi-step API calls.
Why Passive Capture (Not Active Probing)
Unbrowse does not probe websites looking for endpoints. It only captures traffic that the user's normal browsing generates. This is a deliberate choice:
- Passive capture is legally clear — you are recording your own browser's traffic
- Passive capture does not trigger rate limits or anti-bot measures
- The captured endpoints are guaranteed to be functional (the user just used them)
- The auth context is real (captured from an actual session)
Why Zig for Kuri (Not Node.js or Rust)
Kuri needs to be distributed inside an npm package and run on macOS, Linux, and Windows without compilation. Zig compiles to a static binary with zero runtime dependencies. The 464KB output is small enough to bundle in npm. Rust would also work but produces larger binaries. Node.js would add a runtime dependency. Zig hits the sweet spot of small binary, fast startup, and cross-platform support.
Performance
Benchmark data from the arXiv paper (https://arxiv.org/abs/2604.00694):
- Mean speedup over browser automation: 3.6x
- Median speedup: 5.4x
- Domains tested: 94
- Cached route latency: Sub-100ms
- Kuri cold start: ~3ms
- Endpoint extraction time: 200-500ms per session
- Full enrichment pipeline: 2-5 seconds per session
Open Source
Unbrowse's skill engine, capture layer, and CLI are open source. The marketplace backend runs on Cloudflare Workers. Kuri is available as a bundled binary within the npm package.
Explore the architecture: npx unbrowse setup
Read the paper: https://arxiv.org/abs/2604.00694