How to Give Your LangChain Agent Direct API Access
Most LangChain agents access the web through browser tools -- Playwright, Selenium, or scraping libraries. Every web request means launching a browser, rendering a page, parsing HTML, and extracting data. This costs tokens, time, and compute.
There is a faster path. Unbrowse discovers the internal APIs behind websites and gives your LangChain agent direct access to structured data without touching a browser. This tutorial shows how to set it up.
The Problem with Browser-Based Web Tools
A typical LangChain agent using browser tools follows this flow for every web request:
- Launch a headless browser (~2-5 seconds)
- Navigate to the URL (~1-3 seconds)
- Wait for JavaScript to render (~1-5 seconds)
- Extract the full DOM or take a screenshot
- Send the DOM/screenshot to the LLM (~20K-100K tokens)
- LLM reasons about the page and decides next action
- Repeat for every click, scroll, or navigation
A 10-step web workflow can consume over 500,000 tokens and take 30-60 seconds. For agents that access the same sites repeatedly, this cost is entirely redundant.
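To see how those per-step costs compound, here is a back-of-envelope calculation. The per-step figures are illustrative mid-range assumptions taken from the list above, not measurements:

```python
# Back-of-envelope cost of a 10-step browser workflow.
# Per-step figures are mid-range assumptions, not benchmarks.
STEPS = 10
TOKENS_PER_STEP = 50_000   # a rendered DOM typically costs 20K-100K tokens
SECONDS_PER_STEP = 4.5     # launch + navigate + render, mid-range

total_tokens = STEPS * TOKENS_PER_STEP
total_seconds = STEPS * SECONDS_PER_STEP
print(f"~{total_tokens:,} tokens, ~{total_seconds:.0f} seconds")  # ~500,000 tokens, ~45 seconds
```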
The API-Native Alternative
Unbrowse replaces this loop with direct API calls. It captures the network traffic behind any website, reverse-engineers the internal endpoints, and calls them directly on subsequent visits.
- First visit: Unbrowse opens a browser, captures traffic, and indexes the APIs (20-80 seconds)
- Every subsequent visit: a direct API call returns structured data in under 200 milliseconds
The published benchmark (Internal APIs Are All You Need) shows 3.6x mean speedup over Playwright across 94 live domains.
Setup
Install Unbrowse
```bash
git clone --single-branch --depth 1 https://github.com/unbrowse-ai/unbrowse.git ~/unbrowse
cd ~/unbrowse && ./setup --host off
```
Verify it is running:
```bash
unbrowse health
```
Install the Python SDK
```bash
pip install unbrowse langchain langchain-openai
```
Creating the Unbrowse Tool for LangChain
Unbrowse exposes a local REST API on http://localhost:6969. We wrap it as a LangChain tool:
```python
import json

import requests
from langchain.tools import Tool

UNBROWSE_URL = "http://localhost:6969"


def unbrowse_resolve(query: str) -> str:
    """Resolve a web intent to structured data via Unbrowse.

    Input should be a JSON string with 'intent' and 'url' fields.
    Example: {"intent": "get trending repositories", "url": "https://github.com/trending"}
    """
    try:
        params = json.loads(query)
    except json.JSONDecodeError:
        # If plain text, treat as intent with no specific URL
        params = {"intent": query}
    response = requests.post(
        f"{UNBROWSE_URL}/v1/intent/resolve",
        json=params,
        timeout=90,
    )
    result = response.json()
    # If we got direct data, return it
    if "result" in result and result["result"]:
        return json.dumps(result["result"], indent=2)[:4000]
    # If we got available endpoints, return the summary
    if "available_endpoints" in result:
        endpoints = result["available_endpoints"][:5]
        summary = []
        for ep in endpoints:
            summary.append({
                "endpoint_id": ep.get("endpoint_id"),
                "skill_id": ep.get("skill_id"),
                "description": ep.get("description"),
                "action_kind": ep.get("action_kind"),
                "url": ep.get("url", "")[:100],
            })
        return json.dumps(summary, indent=2)
    return json.dumps(result, indent=2)[:4000]


def unbrowse_execute(query: str) -> str:
    """Execute a specific Unbrowse endpoint.

    Input should be a JSON string with 'skill_id' and 'endpoint_id' fields.
    Optionally include 'path', 'extract', and 'limit'.
    """
    params = json.loads(query)
    skill_id = params.pop("skill_id")
    response = requests.post(
        f"{UNBROWSE_URL}/v1/skills/{skill_id}/execute",
        json=params,
        timeout=30,
    )
    result = response.json()
    return json.dumps(result, indent=2)[:4000]


def unbrowse_search(query: str) -> str:
    """Search the Unbrowse marketplace for available API routes.

    Input should be a search intent string.
    """
    response = requests.post(
        f"{UNBROWSE_URL}/v1/search",
        json={"intent": query},
        timeout=15,
    )
    result = response.json()
    return json.dumps(result, indent=2)[:4000]


# Create LangChain tools
resolve_tool = Tool(
    name="unbrowse_resolve",
    func=unbrowse_resolve,
    description="""Resolve a web intent to structured data. Use this instead of
browsing websites. Input: JSON with 'intent' (what you want) and 'url'
(the website). Returns structured API data or a list of available endpoints.
Example: {"intent": "get trending repos", "url": "https://github.com/trending"}""",
)

execute_tool = Tool(
    name="unbrowse_execute",
    func=unbrowse_execute,
    description="""Execute a specific API endpoint discovered by unbrowse_resolve.
Input: JSON with 'skill_id', 'endpoint_id', and optionally 'path' (drill into
nested data), 'extract' (pick specific fields), 'limit' (cap results).
Example: {"skill_id": "abc", "endpoint_id": "def", "limit": 10}""",
)

search_tool = Tool(
    name="unbrowse_search",
    func=unbrowse_search,
    description="""Search the Unbrowse marketplace for API routes other agents
have discovered. Use when you need data from a website but do not have a
specific URL. Input: a search query string.""",
)
```
Building the Agent
Now create a LangChain agent that uses Unbrowse as its primary web access tool:
```python
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder

# Initialize the LLM
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Define the tools
tools = [resolve_tool, execute_tool, search_tool]

# Create a prompt that guides the agent to use Unbrowse
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a web research agent. Use unbrowse_resolve as your
primary tool for accessing websites. It returns structured API data
instead of raw HTML, making your responses faster and more accurate.

Workflow:
1. Use unbrowse_resolve with intent and URL to get data or discover endpoints
2. If you get available_endpoints, pick the best match and use unbrowse_execute
3. Use unbrowse_search to find routes when you don't have a specific URL

Never try to scrape or parse HTML. Unbrowse gives you structured data directly."""),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

# Create and run the agent
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    max_iterations=5,
)
```
Example: Research Agent
Here is a complete example that builds a research agent capable of pulling data from multiple websites:
```python
# Run the agent
result = agent_executor.invoke({
    "input": "Find the top trending GitHub repositories today and "
             "compare them with the top stories on Hacker News."
})
print(result["output"])
```
What happens under the hood:
- The agent calls
unbrowse_resolvefor GitHub trending -- if cached, returns in <200ms - The agent calls
unbrowse_resolvefor Hacker News -- if cached, returns in <200ms - The agent combines the structured data and generates the comparison
Total time with browser tools: 30-60 seconds, 200K+ tokens Total time with Unbrowse (cached): 2-5 seconds, ~15K tokens
Example: Price Monitoring Agent
```python
def create_price_monitor():
    """Agent that monitors prices across multiple sites."""
    result = agent_executor.invoke({
        "input": """Check the current price of the MacBook Pro M4 on:
        1. Apple.com
        2. Amazon.com
        3. Best Buy
        Report the prices in a comparison table."""
    })
    return result["output"]
```
The first run indexes each site. Subsequent runs resolve from cache, making scheduled price checks nearly instant.
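The speedup pattern here is easiest to see as route-level memoization: the first call pays the indexing cost, and later calls hit the cache. A standalone sketch with a stub standing in for Unbrowse (the delay is simulated and scaled down; real indexing takes 20-80 seconds):

```python
import time
from functools import lru_cache

def index_site(url: str) -> str:
    """Stand-in for Unbrowse's first-visit capture and indexing step."""
    time.sleep(0.05)  # simulated delay, scaled down from 20-80s
    return f"structured data for {url}"

@lru_cache(maxsize=None)
def resolve(url: str) -> str:
    """First call pays the indexing cost; later calls return from cache."""
    return index_site(url)

start = time.perf_counter()
resolve("https://www.apple.com")
first = time.perf_counter() - start

start = time.perf_counter()
resolve("https://www.apple.com")
cached = time.perf_counter() - start

print(f"first run: {first * 1000:.1f}ms, cached run: {cached * 1000:.3f}ms")
```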
Example: Multi-Source Data Aggregation
```python
result = agent_executor.invoke({
    "input": """Compile a daily briefing:
    1. Top 5 Hacker News stories
    2. Trending GitHub repos in Python
    3. Latest posts from r/MachineLearning
    Format as a concise daily digest."""
})
```
This pattern is where Unbrowse delivers the most value. A daily briefing agent runs the same queries every day. After the first run indexes each site, every subsequent run completes in seconds instead of minutes.
Handling Authentication
For sites that require login, Unbrowse extracts cookies from your Chrome browser automatically. If you are logged into a site in Chrome, it just works.
For explicit login when cookies are not available:
```python
def unbrowse_login(url: str) -> str:
    """Trigger interactive login for an authenticated site."""
    response = requests.post(
        f"{UNBROWSE_URL}/v1/auth/login",
        json={"url": url},
        timeout=120,
    )
    # Serialize to a string so the tool output matches the declared return type
    return json.dumps(response.json(), indent=2)

# Add as a tool
login_tool = Tool(
    name="unbrowse_login",
    func=unbrowse_login,
    description="""Trigger interactive browser login for a site that
requires authentication. Opens a browser window for the user to
log in. Use when unbrowse_resolve returns auth_required.""",
)
```
Performance Comparison
Here is a real comparison for a simple task: "Get the top 10 trending GitHub repositories."
With LangChain + Playwright
```
Action: playwright_navigate -> github.com/trending (3.2s, 45K tokens)
Action: playwright_get_content (1.1s, 82K tokens)
Action: parse HTML and extract repos (0.5s, 12K tokens)
Total: 4.8 seconds, 139K tokens
```
With LangChain + Unbrowse (first visit)
```
Action: unbrowse_resolve -> github.com/trending (24s, 8K tokens)
Total: 24 seconds, 8K tokens (one-time indexing cost)
```
With LangChain + Unbrowse (cached)
```
Action: unbrowse_resolve -> github.com/trending (0.18s, 4K tokens)
Total: 0.18 seconds, 4K tokens
```
The first visit is slower because Unbrowse is indexing. But every subsequent request is 25x faster and uses 35x fewer tokens.
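The 25x and 35x figures are rounded from the traces above:

```python
# Speedup and token ratios from the cached-vs-Playwright traces
playwright_seconds, playwright_tokens = 4.8, 139_000
cached_seconds, cached_tokens = 0.18, 4_000

print(f"speedup: {playwright_seconds / cached_seconds:.1f}x")        # ~26.7x
print(f"token reduction: {playwright_tokens / cached_tokens:.2f}x")  # 34.75x
```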
Tips for Production Use
Pre-warm your cache. Before deploying, run resolves against all the sites your agent will access. This moves the indexing cost to build time rather than runtime.
```python
sites = [
    {"intent": "get trending repos", "url": "https://github.com/trending"},
    {"intent": "get top stories", "url": "https://news.ycombinator.com"},
    {"intent": "get front page", "url": "https://reddit.com"},
]
for site in sites:
    unbrowse_resolve(json.dumps(site))
```
Use the marketplace. Popular sites are likely already indexed by other agents. Your first resolve may be instant if someone else has already contributed the route.
Submit feedback. After successful resolves, submit feedback to improve route quality:
```python
# skill_id and endpoint_id come from an earlier successful resolve/execute
requests.post(f"{UNBROWSE_URL}/v1/feedback", json={
    "skill_id": skill_id,
    "endpoint_id": endpoint_id,
    "rating": 5,
    "outcome": "success",
})
```
Handle errors gracefully. If Unbrowse returns no cached route and capture fails, fall back to your existing browser tool rather than failing the task entirely.
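A minimal sketch of that fallback pattern, with the resolver and browser tool injected as callables so it works with any stack. The stub functions below are illustrative, not part of the Unbrowse API:

```python
from typing import Callable

def resolve_with_fallback(
    query: str,
    resolve: Callable[[str], str],
    browser_fallback: Callable[[str], str],
) -> str:
    """Try the fast API path first; fall back to the browser tool on failure."""
    try:
        result = resolve(query)
        if result:  # treat an empty response as a miss
            return result
    except Exception:
        pass  # network error, timeout, or no route captured
    return browser_fallback(query)

# Usage with stubs standing in for the real tools:
def flaky_resolve(query: str) -> str:
    raise TimeoutError("capture failed")

def browser_tool(query: str) -> str:
    return f"scraped result for: {query}"

print(resolve_with_fallback("get trending repos", flaky_resolve, browser_tool))
```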
The TypeScript SDK Alternative
If you are building in TypeScript, Unbrowse provides a native SDK:
```typescript
import { Unbrowse } from "@unbrowse/sdk";

const unbrowse = new Unbrowse();
const result = await unbrowse.resolve({
  intent: "get trending repositories",
  url: "https://github.com/trending",
});
```
The SDK also exposes a Playwright-compatible Browser API:
```typescript
import { Browser } from "unbrowse";

const browser = await Browser.launch();
const page = await browser.newPage();
const response = await page.goto("https://github.com/trending");
const data = await response.json();
await browser.close();
```
page.goto() checks the skill cache first. If a cached route exists, no browser tab opens and structured data is returned directly.
Conclusion
Giving your LangChain agent Unbrowse instead of browser tools changes the economics of web access. The first visit indexes. Every subsequent visit is a fast API call. For agents that run daily, weekly, or on demand against the same set of websites, that can mean token costs cut by over 90% and resolution dozens of times faster on cached routes.
The setup takes five minutes. The payoff compounds with every cached route.
Get started: github.com/unbrowse-ai/unbrowse
Read the paper: Internal APIs Are All You Need