How Unbrowse Works: Architecture of an API Discovery Engine
A complete technical walkthrough of Unbrowse's architecture, from passive traffic capture through endpoint extraction, authentication detection, graph construction, and cache-first resolution.
Unbrowse is not a browser. It is not a scraper. It is not a proxy. Unbrowse is an API discovery engine that happens to use a browser as its sensor.
When you browse a website through Unbrowse, the system captures every network request, extracts structured API endpoints, detects authentication patterns, builds a typed endpoint graph, and publishes it to a marketplace. When an AI agent later needs data from that website, Unbrowse resolves the request against the cache in under 100 milliseconds instead of spending 8+ seconds rendering a page.
This article walks through the complete architecture: every component, every pipeline stage, every design decision.
The Stack
Unbrowse has four layers:
- Kuri — A Zig-native CDP (Chrome DevTools Protocol) broker. 464KB binary. ~3ms cold start. This is the browser runtime.
- Capture Layer — Passive HAR recording + JavaScript fetch/XHR interceptor. Records all network traffic during a browse session.
- Intelligence Layer — Endpoint extraction, auth detection, schema inference, graph construction, semantic description. This is where raw traffic becomes structured API knowledge.
- Marketplace — Cloudflare Worker API that stores, ranks, and serves endpoint graphs. Cache-first resolution for AI agent requests.
Layer 1: Kuri — The Browser Runtime
Kuri is not Puppeteer. It is not Playwright. It is a purpose-built CDP broker written in Zig that does exactly three things:
- Launches Chrome with the right flags
- Manages CDP connections to browser tabs
- Executes DevTools Protocol commands
Why Zig? Because Kuri needs to be bundled inside the Unbrowse npm package and work on every platform without external dependencies. The entire binary is 464KB. It cold-starts in ~3 milliseconds. Compare this to Playwright's 50MB+ browser download or Puppeteer's Node.js overhead.
Kuri always runs with HEADLESS=false. This is not a headless automation tool — it is a real browser with a real UI. The non-headless mode enables two critical capabilities:
- Stealth extension — Kuri loads an anti-bot extension that patches common fingerprinting vectors (navigator.webdriver, chrome.runtime, WebGL renderer strings). This lets Unbrowse browse sites that block headless browsers.
- Persistent Chrome profile — --user-data-dir maintains cookies, localStorage, and session state across browsing sessions. You log in once; Kuri remembers.
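As a rough illustration, the launch resembles a plain Chrome invocation with these flags (the flags are real Chrome flags, but the profile directory, extension path, and port are placeholders, not Kuri's actual configuration):

```shell
# Illustrative Chrome launch approximating Kuri's setup.
#   --remote-debugging-port exposes CDP for the broker
#   --user-data-dir keeps cookies/localStorage across sessions
#   --load-extension loads the anti-fingerprinting stealth extension
google-chrome \
  --remote-debugging-port=9222 \
  --user-data-dir="$HOME/.unbrowse/profile" \
  --load-extension="/path/to/stealth-extension"
```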
Cookie Injection
When a browse session starts (unbrowse go <url>), Kuri extracts cookies from the user's real Chrome or Firefox SQLite database and injects them into the browsing tab via CDP's Network.setCookie. This means you do not need to re-authenticate through Unbrowse — your existing browser sessions carry over.
This is critical for mining. Many valuable endpoints require authentication. Without cookie injection, miners would need to manually log in to every site through Unbrowse. With it, mining is completely passive.
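A minimal sketch of what the injection step sends, assuming cookies have already been read from the profile database. The message shape follows CDP's Network.setCookie parameters; the helper name and cookie values are illustrative:

```javascript
// Build the CDP Network.setCookie message Kuri sends for each cookie
// pulled from the user's browser profile (sketch; field coverage abridged).
function buildSetCookieCommand(id, cookie) {
  return JSON.stringify({
    id, // CDP message id, echoed back in the response
    method: "Network.setCookie",
    params: {
      name: cookie.name,
      value: cookie.value,
      domain: cookie.domain,
      path: cookie.path ?? "/",
      secure: cookie.secure ?? false,
      httpOnly: cookie.httpOnly ?? false,
    },
  });
}

// The resulting JSON string goes out over the CDP WebSocket connection.
const msg = buildSetCookieCommand(1, {
  name: "session", value: "abc123", domain: ".example.com", secure: true,
});
```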
Layer 2: Capture — Dual-Source Traffic Recording
Capture is the foundation of everything. If we miss a network request, we miss an endpoint. Two independent capture mechanisms run simultaneously:
HAR Recording (CDP-Based)
Kuri enables CDP network events (Network.enable) and records every request/response pair as a HAR (HTTP Archive) entry. This captures:
- Request URL, method, headers, body
- Response status, headers, body
- Timing information
- Request initiator chain (which script triggered this request)
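An abridged HAR 1.2 entry carrying the fields above might look like this (values are illustrative; `_initiator` is Chrome's vendor extension for the initiator chain):

```javascript
// Abridged HAR entry as recorded from CDP network events.
// Only the fields discussed above are shown.
const harEntry = {
  startedDateTime: "2024-01-01T00:00:00.000Z",
  time: 142.3, // total request time in ms
  request: {
    method: "GET",
    url: "https://api.example.com/v1/items?page=2",
    headers: [{ name: "Authorization", value: "Bearer <token>" }],
  },
  response: {
    status: 200,
    headers: [{ name: "Content-Type", value: "application/json" }],
    content: { mimeType: "application/json", text: "[]" },
  },
  _initiator: { type: "script" }, // which script triggered the request
};
```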
HAR capture is reliable for traditional page loads and synchronous XHR. But it has a blind spot: some SPAs fire async fetch requests that CDP's network events miss, especially those initiated during complex state transitions or Web Worker contexts.
JavaScript Interceptor (DOM-Based)
To catch what HAR misses, Unbrowse injects an interceptor script (INTERCEPTOR_SCRIPT) into every page via CDP's Page.addScriptToEvaluateOnNewDocument. This script monkey-patches window.fetch and XMLHttpRequest.prototype.open to log every outgoing request:
// Simplified interceptor logic (fetch patch; XMLHttpRequest is patched similarly)
const originalFetch = window.fetch;
window.fetch = async function (...args) {
  // Normalize the (input, init) arguments into a Request for uniform logging
  const request = new Request(...args);
  logRequest(request.url, request.method, request.headers);
  const response = await originalFetch.apply(this, args);
  logResponse(request.url, response.status, response.headers);
  return response;
};
The interceptor catches requests that CDP misses and provides client-side timing data.
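The XHR half of the patch works the same way. A sketch, using a stand-in class so it runs outside a browser (`logRequest` is the same hypothetical logger as in the fetch patch):

```javascript
// Sketch: patch an XMLHttpRequest-like class's open() to log every request.
function patchXHROpen(XHRClass, logRequest) {
  const originalOpen = XHRClass.prototype.open;
  XHRClass.prototype.open = function (method, url, ...rest) {
    logRequest(url, method); // record before the request goes out
    return originalOpen.call(this, method, url, ...rest);
  };
}

// Minimal stand-in for XMLHttpRequest so the sketch runs anywhere.
class FakeXHR {
  open(method, url) { this.method = method; this.url = url; }
}

const log = [];
patchXHROpen(FakeXHR, (url, method) => log.push({ url, method }));
new FakeXHR().open("GET", "https://api.example.com/items");
// log now holds one record for the intercepted request
```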
Merging Sources
When a browse session ends (tab closed or navigation), both capture sources are merged. Deduplication uses URL + method + timestamp proximity as the merge key. HAR entries are preferred when both sources capture the same request (HAR has richer metadata), but interceptor-only entries fill the gaps.
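The merge step can be sketched as follows. HAR entries win on collisions; two records count as the same request when URL and method match within a timestamp window (the window size here is an assumption):

```javascript
// Merge the two capture sources. HAR entries are kept as-is; interceptor
// entries are added only when no HAR entry matches on URL + method within
// `windowMs` of the same timestamp.
function mergeCaptures(harEntries, interceptorEntries, windowMs = 250) {
  const merged = [...harEntries];
  for (const entry of interceptorEntries) {
    const duplicate = harEntries.some(
      (h) =>
        h.url === entry.url &&
        h.method === entry.method &&
        Math.abs(h.timestamp - entry.timestamp) <= windowMs
    );
    if (!duplicate) merged.push(entry); // interceptor-only: fills a gap
  }
  return merged;
}

const har = [{ url: "/api/items", method: "GET", timestamp: 1000 }];
const xhr = [
  { url: "/api/items", method: "GET", timestamp: 1120 }, // duplicate of HAR entry
  { url: "/api/user", method: "GET", timestamp: 1400 },  // interceptor-only
];
const merged = mergeCaptures(har, xhr);
// merged: the HAR record plus the interceptor-only /api/user entry
```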
Together, the two sources capture over 99% of the API calls a site makes during a session.
Layer 3: Intelligence — From Traffic to Knowledge
Raw network traffic is noise. The intelligence layer transforms it into structured API knowledge through a six-stage pipeline.
Stage 1: Endpoint Extraction
extractEndpoints takes merged traffic and identifies API endpoints. Not every network request is an API call. The extractor filters out:
- Static assets (images, CSS, fonts, JS bundles)
- Tracking pixels and analytics beacons
- CDN resources
- WebSocket connections (handled separately)
What remains are API calls. For each one, the extractor produces:
- URL template — Parameterized URL with path variables identified (e.g., /api/v1/repos/{owner}/{repo}/issues)
- Method — GET, POST, PUT, DELETE, PATCH
- Query parameters — Typed and annotated (required vs. optional, enum values, default values)
- Request body schema — For POST/PUT, the body structure is inferred from the captured data
- Response schema — JSON structure with types, nullable fields, and array detection
URL template extraction is the most critical step. The system must distinguish between path parameters (variable) and path segments (fixed). For example, in /r/MachineLearning/hot.json, is MachineLearning a parameter or a fixed segment? The extractor uses frequency analysis across multiple requests to the same base path and semantic heuristics to determine this.
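The frequency-analysis half of that decision can be sketched like this: compare path segments across multiple captured URLs with the same shape, and turn segments that vary into placeholders. (The placeholder naming is illustrative; the real extractor also applies semantic heuristics.)

```javascript
// Infer a URL template from several captured paths of the same shape:
// segments that differ across captures become {param} placeholders.
function inferTemplate(paths) {
  const split = paths.map((p) => p.split("/").filter(Boolean));
  const width = split[0].length;
  const template = [];
  for (let i = 0; i < width; i++) {
    const values = new Set(split.map((segs) => segs[i]));
    // More than one distinct value at this position => variable segment
    template.push(values.size > 1 ? `{param${i}}` : split[0][i]);
  }
  return "/" + template.join("/");
}

const template = inferTemplate([
  "/api/v1/repos/alice/widgets/issues",
  "/api/v1/repos/bob/gadgets/issues",
]);
// → "/api/v1/repos/{param3}/{param4}/issues"
```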
Stage 2: Auth Detection
extractAuthHeaders scans request headers across all captured endpoints for authentication patterns:
- Bearer tokens — Authorization: Bearer <token>
- API keys — Custom headers like X-API-Key, Api-Key, or query parameters like ?key=
- Cookies — Session cookies that are required for authenticated endpoints
- OAuth tokens — Tokens in Authorization headers with OAuth-specific patterns
- CSRF tokens — Cross-site request forgery tokens in headers or hidden form fields
The detector classifies auth into 15+ SSO provider categories (Google, GitHub, Microsoft, Okta, Auth0, etc.) and records which endpoints require which auth type.
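A stripped-down classifier over the header patterns above might look like this (categories and regexes are illustrative, not Unbrowse's actual rules):

```javascript
// Classify a request header into an auth category, or null if it is not
// an auth header. Mirrors the pattern list above in simplified form.
function classifyAuthHeader(name, value) {
  const n = name.toLowerCase();
  if (n === "authorization" && /^Bearer\s+/i.test(value)) return "bearer";
  if (/^(x-)?api-key$/.test(n)) return "api-key";
  if (n === "cookie") return "cookie";
  if (/csrf/i.test(n)) return "csrf";
  return null;
}

classifyAuthHeader("Authorization", "Bearer eyJ..."); // → "bearer"
classifyAuthHeader("X-API-Key", "abc123");            // → "api-key"
```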
Stage 3: Credential Storage
storeCredential saves extracted auth credentials to a local secure vault, keyed by domain. When Unbrowse later needs to execute a cached route, it can attach the correct auth credentials without re-extraction.
Credentials are stored locally only. They are never transmitted to the marketplace. The marketplace publishes the auth requirement (e.g., "requires Bearer token for github.com") but not the credential itself.
Stage 4: Endpoint Graph Construction
buildSkillOperationGraph is where Unbrowse's architecture diverges most from traditional API documentation. Instead of treating endpoints as a flat list, Unbrowse builds a typed directed graph.
Edge types include:
- Parent/Child — An endpoint that must be called before another (e.g., list repos before getting repo details)
- Pagination — Sequential calls with offset/cursor parameters
- Auth Dependency — An endpoint that provides tokens required by other endpoints
- Prefetch — Endpoints that are typically called together (e.g., user profile + user repos)
- Data Dependency — An endpoint whose response contains IDs needed by another endpoint's request
The graph structure enables intelligent resolution. When an agent requests "get issues for the unbrowse-ai/unbrowse repo," the graph knows that this requires first resolving the repo endpoint, extracting the repo ID, and then calling the issues endpoint with that ID.
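A minimal sketch of that chaining, assuming an acyclic graph (node ids and the edge representation are illustrative):

```javascript
// A tiny typed endpoint graph: repoIssues needs owner/repo values that
// only appear in listRepos responses, so a data-dependency edge links them.
const graph = {
  nodes: {
    listRepos: { template: "/user/repos" },
    repoIssues: { template: "/repos/{owner}/{repo}/issues" },
  },
  edges: [
    { from: "listRepos", to: "repoIssues", type: "data-dependency" },
  ],
};

// Return the call order for a target endpoint: dependencies first.
// Assumes the graph is acyclic (no cycle detection in this sketch).
function callOrder(graph, target, order = []) {
  for (const e of graph.edges) {
    if (e.to === target) callOrder(graph, e.from, order);
  }
  if (!order.includes(target)) order.push(target);
  return order;
}

callOrder(graph, "repoIssues"); // → ["listRepos", "repoIssues"]
```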
Stage 5: Semantic Description
generateLocalDescription and augmentEndpointsWithAgent add human-readable (and LLM-readable) descriptions to every endpoint. The local description uses heuristics — URL patterns, response field names, HTTP methods — to generate a first pass. The agent augmentation step uses an LLM to produce richer descriptions:
- What data this endpoint returns
- When you would use it
- What parameters affect the response
- How it relates to other endpoints in the graph
These descriptions are critical for resolution. When an agent asks for "trending Python repos," the marketplace needs to match that intent against endpoint descriptions to find the right route.
Stage 6: Marketplace Publish
cachePublishedSkill and queueBackgroundIndex publish the complete endpoint graph to the Unbrowse marketplace. The published artifact includes:
- All endpoints with URL templates, schemas, and descriptions
- The full endpoint graph with typed edges
- Auth requirements (not credentials)
- Reliability metadata (initial score based on capture quality)
- Miner identity (for x402 payment routing)
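Putting those pieces together, the published artifact has roughly this shape (field names and values are assumptions for illustration, not Unbrowse's actual wire format):

```javascript
// Illustrative published-skill artifact. Note: the auth block carries the
// requirement only; the credential itself never leaves the local vault.
const publishedSkill = {
  domain: "github.com",
  endpoints: [
    {
      template: "/repos/{owner}/{repo}/issues",
      method: "GET",
      description: "List issues for a repository",
      responseSchema: { type: "array", items: { type: "object" } },
    },
  ],
  graph: {
    edges: [
      { from: "/user/repos", to: "/repos/{owner}/{repo}/issues", type: "data-dependency" },
    ],
  },
  auth: { type: "bearer", required: true }, // requirement, never the token
  reliability: 0.92,                        // initial score from capture quality
  miner: "miner-identity-placeholder",      // identity for x402 payment routing
};
```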
Layer 4: Marketplace — Cache-First Resolution
The marketplace is a Cloudflare Worker that stores endpoint graphs and resolves agent requests against them.
Resolution Pipeline
When an agent calls unbrowse resolve "get trending repos from GitHub", the resolution follows a priority cascade:
1. Local route cache — Check whether this exact request has been resolved before and the cached result is still fresh. Sub-1ms.
2. Marketplace search — Semantic search across all published endpoint graphs, matching intent against endpoint descriptions, URL templates, and schema metadata. 50-200ms.
3. First-pass browser — If no cached route matches, Unbrowse opens a browser session, navigates to the likely URL, and captures endpoints in real time. 5-8 seconds.
4. Browse session handoff — If first-pass capture does not produce a matching endpoint, the session is handed off to the calling agent. The agent drives the browser manually while Unbrowse indexes passively.
The goal is to move as many requests as possible from steps 3 and 4 (slow, expensive) to steps 1 and 2 (fast, cheap). Every successful first-pass capture improves future resolution speed.
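The cascade reduces to a simple loop: try each stage in order and stop at the first one that produces a route. The stage functions below are stand-ins for the real cache lookup, marketplace search, and first-pass browse:

```javascript
// Priority cascade sketch: first stage to return a route wins, so the
// slower stages never run when a faster one hits.
async function resolveIntent(intent, stages) {
  for (const stage of stages) {
    const route = await stage(intent);
    if (route) return route;
  }
  return null; // every stage missed: hand the session off to the agent
}

// Demo stages: the local cache misses, marketplace search hits.
const localCache = async () => null;
const marketplaceSearch = async (intent) => ({ source: "marketplace", intent });
const routePromise = resolveIntent("get trending repos", [localCache, marketplaceSearch]);
```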
Verification Loop
Published endpoints are not static. Websites change their APIs. Endpoints go down. Response schemas drift.
The marketplace runs a verification loop every 6 hours against every published endpoint:
- Execute the endpoint with cached auth credentials
- Compare the response schema against the published schema
- Update the EMA-based reliability score
- If schema drift is detected, flag for re-indexing
- If the endpoint returns errors consistently, reduce its marketplace ranking
The EMA (Exponential Moving Average) scoring means recent failures weigh more heavily than historic successes. An endpoint that was reliable for months but started failing today will see its score drop quickly.
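The update rule is a one-liner; the smoothing factor here is an assumption, not Unbrowse's actual value:

```javascript
// EMA reliability update: each observation is 1 (success) or 0 (failure),
// and recent observations dominate the running score.
function updateScore(prev, success, alpha = 0.3) {
  return alpha * (success ? 1 : 0) + (1 - alpha) * prev;
}

// A long-reliable endpoint (score 1.0) that starts failing drops fast:
let score = 1.0;
for (let i = 0; i < 3; i++) score = updateScore(score, false);
// after three consecutive failures, score ≈ 0.343 (= 0.7^3)
```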
Ranking
When multiple endpoint graphs match a resolve request, the marketplace ranks them by:
- Reliability score — Higher is better. Verified, stable endpoints rank first.
- Graph completeness — More complete graphs (more endpoints, more edge types) rank higher.
- Schema quality — Well-typed, consistent schemas rank higher.
- Freshness — Recently verified endpoints rank higher.
- Miner reputation — Miners with a history of high-quality contributions get a ranking boost.
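One simple way to combine those signals is a weighted sum; the weights below are assumptions for the sketch, not Unbrowse's actual values:

```javascript
// Illustrative weighted ranking over the five signals listed above.
// Each signal is assumed normalized to [0, 1].
function rankScore(
  g,
  w = { reliability: 0.4, completeness: 0.2, schemaQuality: 0.15, freshness: 0.15, reputation: 0.1 }
) {
  return (
    w.reliability * g.reliability +
    w.completeness * g.completeness +
    w.schemaQuality * g.schemaQuality +
    w.freshness * g.freshness +
    w.reputation * g.reputation
  );
}

const stable = rankScore({ reliability: 0.95, completeness: 0.8, schemaQuality: 0.9, freshness: 1.0, reputation: 0.7 });
const flaky  = rankScore({ reliability: 0.30, completeness: 0.8, schemaQuality: 0.9, freshness: 1.0, reputation: 0.7 });
// stable > flaky: reliability dominates the ranking
```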
Design Decisions
Several non-obvious design choices shape the architecture:
Why a Real Browser (Not a Proxy)
Unbrowse could theoretically work as a proxy — intercept traffic from any browser and index it. We chose the embedded browser approach because:
- Proxies break HTTPS pinning on many sites
- Proxies cannot inject the JavaScript interceptor
- Proxies cannot inject cookies from the user's real browser
- A controlled browser environment ensures consistent capture quality
Why Endpoint Graphs (Not Flat Lists)
Flat API documentation (OpenAPI/Swagger) treats every endpoint as independent. In practice, APIs have complex dependency chains. You cannot call /repos/{owner}/{repo}/issues without first knowing the owner and repo values, which come from /user/repos or /search/repositories.
The graph structure encodes these dependencies explicitly, enabling Unbrowse to automatically chain multi-step API calls.
Why Passive Capture (Not Active Probing)
Unbrowse does not probe websites looking for endpoints. It only captures traffic that the user's normal browsing generates. This is a deliberate choice:
- Passive capture is legally clear — you are recording your own browser's traffic
- Passive capture does not trigger rate limits or anti-bot measures
- The captured endpoints are guaranteed to be functional (the user just used them)
- The auth context is real (captured from an actual session)
Why Zig for Kuri (Not Node.js or Rust)
Kuri needs to be distributed inside an npm package and run on macOS, Linux, and Windows without compilation. Zig compiles to a static binary with zero runtime dependencies. The 464KB output is small enough to bundle in npm. Rust would also work but produces larger binaries. Node.js would add a runtime dependency. Zig hits the sweet spot of small binary, fast startup, and cross-platform support.
Performance
Benchmark data from the arXiv paper (https://arxiv.org/abs/2604.00694):
- Mean speedup over browser automation: 3.6x
- Median speedup: 5.4x
- Domains tested: 94
- Cached route latency: Sub-100ms
- Kuri cold start: ~3ms
- Endpoint extraction time: 200-500ms per session
- Full enrichment pipeline: 2-5 seconds per session
Open Source
Unbrowse's skill engine, capture layer, and CLI are open source. The marketplace backend runs on Cloudflare Workers. Kuri is available as a bundled binary within the npm package.
Explore the architecture: npx unbrowse setup
Read the paper: https://arxiv.org/abs/2604.00694