How to Access GitHub Data Faster Than the REST API

GitHub's REST API is rate-limited to 5,000 requests per hour. Learn how to discover GitHub's internal APIs from browsing traffic using Unbrowse — faster data access without rate limit constraints.

Lewis Tham
April 3, 2026

GitHub's REST API is well-documented and widely used. It's also rate-limited to 5,000 requests per hour for authenticated users (60/hour unauthenticated). For applications that need to scan repositories, analyze code patterns across organizations, or monitor activity across many repos, that limit is a constant bottleneck.

GitHub's GraphQL API (v4) helps by letting you fetch multiple resources in one request, but it has its own rate limits (5,000 points/hour) and complex query costs that make large-scale data access difficult.

There's a faster path that most developers don't know about.

The Problem with GitHub Data Access at Scale

GitHub's APIs are good for normal use but hit walls at scale:

  • REST API (v3): 5,000 requests/hour authenticated. Pagination (100 results per page) means a repo with 1,000 issues requires ~10 requests just to list them. Scanning 100 repos means 1,000+ requests minimum. When you exhaust the quota, you wait for the hourly reset before continuing.
  • GraphQL API (v4): 5,000 points/hour. Complex queries cost more points. Nested queries (repos -> issues -> comments) can cost 50-100 points each, giving you effectively 50-100 complex queries per hour.
  • GitHub Apps: Higher rate limits (up to 15,000/hour for installations), but require app registration, installation flows, and JWT authentication. Significant setup overhead for simple data access.
  • GitHub Archive / GHTorrent: Historical data only, delayed by hours to days. No real-time access.
  • Scraping github.com: GitHub uses Turbo (server-rendered HTML fragments), making traditional scraping brittle. Content structure changes frequently.

For AI agents that need to search code, analyze repositories, or monitor GitHub activity, the rate limits create artificial delays that slow down workflows.
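The arithmetic behind these limits is worth making concrete. A minimal Python sketch (the quota and pagination numbers come from the figures above; the function itself is ours, not part of any API) estimates how much of the hourly REST quota a scan consumes:

```python
import math

def rest_scan_hours(repos: int, issues_per_repo: int,
                    per_page: int = 100, limit_per_hour: int = 5000) -> float:
    """Rough hours of REST quota needed to list all issues across `repos`
    repositories, counting pagination only (no retries, no other calls)."""
    requests_needed = repos * math.ceil(issues_per_repo / per_page)
    return requests_needed / limit_per_hour

# 100 repos x 1,000 issues each -> 1,000 paginated requests,
# a fifth of the hourly quota spent before any real analysis happens.
print(rest_scan_hours(100, 1000))  # 0.2
```

Anything beyond listing — fetching issue bodies, comments, or file contents — multiplies that count further.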

Shadow APIs: The Alternative

Every time you visit github.com, your browser makes API calls behind the scenes. These internal endpoints return clean JSON data — the same data that renders the repository pages, search results, and activity feeds you see.

GitHub.com uses a mix of server-rendered HTML and client-side API calls. The interesting internal endpoints handle dynamic content: code search, repository insights, activity feeds, notification streams, and the Copilot interface. These endpoints are authenticated via your session cookies and aren't subject to the same rate limits as the REST/GraphQL APIs.
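The request shape this implies can be sketched as follows. The cookie name (`user_session`), the `X-Requested-With` header, and the example path are assumptions based on common frontend patterns — not verified GitHub internals — and the request is only built here, not sent:

```python
from urllib.request import Request

def internal_request(path: str, session_cookie: str) -> Request:
    """Build (but don't send) a request shaped like the calls a logged-in
    browser makes: session-cookie auth plus a JSON Accept header."""
    return Request(
        f"https://github.com{path}",
        headers={
            # Hypothetical cookie name; real session cookies may differ.
            "Cookie": f"user_session={session_cookie}",
            "Accept": "application/json",
            # Frontends often send this so the server responds with JSON:
            "X-Requested-With": "XMLHttpRequest",
        },
    )

req = internal_request("/dashboard/recent-activity", "<your-session-cookie>")
print(req.full_url)  # https://github.com/dashboard/recent-activity
```

The point is that authentication rides on the session you already have, rather than on a separately provisioned token.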

Unbrowse captures these shadow APIs automatically from real browsing sessions.

What Unbrowse Discovers on GitHub

Browsing GitHub through Unbrowse reveals internal API endpoints that complement the official APIs:

  • Code Search: GET /search/code?q={query}&type=code with internal JSON response mode — code search results with file paths, repository context, match highlights, and language detection. The internal search endpoint returns richer result metadata than the REST API's search endpoint.
  • Repository Insights: GET /{owner}/{repo}/graphs/contributors-data — contributor activity data with commit counts, additions, deletions over time. This endpoint powers the Contributors graph and returns complete time-series data.
  • Activity Feed: GET /dashboard/recent-activity — your personalized activity feed with push events, issue activity, PR reviews, and release notifications across all watched repos.
  • File Viewer: GET /{owner}/{repo}/blob/{branch}/{path}?raw=true combined with GET /{owner}/{repo}/file-list/{branch} — file tree navigation and content retrieval that's faster than the Contents API for browsing repository structures.
  • Notifications Stream: GET /notifications/beta/threads — real-time notification data with full context (issue body, PR diff stats, review comments) that the REST API's notifications endpoint doesn't include.

How It Works

npm install -g unbrowse
unbrowse resolve "search for MCP server implementations in TypeScript" --url https://github.com

The process:

  1. Cookie Injection: Unbrowse extracts your GitHub session cookies from your local browser. Your authenticated session provides access to private repos and organization resources you can see.
  2. Browse: Kuri opens GitHub with your real session. Stealth extensions reduce the risk of fingerprinting and bot detection.
  3. Capture: Internal API calls are intercepted as you navigate — search results, repo data, file contents, activity feeds.
  4. Index: Each endpoint is analyzed for URL templates, authentication patterns (session cookies + CSRF tokens), and response schemas. GitHub's internal endpoints are mapped to semantic intents.
  5. Cache and Execute: Indexed routes are stored. Future requests hit GitHub's internal endpoints directly — faster responses, different rate limit pools.
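Steps 4 and 5 can be sketched as a toy route cache: captured endpoints indexed under semantic intents, then resolved without re-opening the browser. The intent names, template strings, and `Route` structure are illustrative, not Unbrowse's actual data model:

```python
from dataclasses import dataclass

@dataclass
class Route:
    template: str  # URL template captured from browsing traffic
    auth: str      # how the endpoint authenticates

# Toy index: semantic intent -> captured route.
ROUTE_CACHE: dict[str, Route] = {}

def index(intent: str, template: str, auth: str = "session-cookie+csrf") -> None:
    """Step 4: store an analyzed endpoint under a semantic intent."""
    ROUTE_CACHE[intent] = Route(template, auth)

def resolve(intent: str, **params: str) -> str:
    """Step 5: a cache hit fills the template directly -- no browsing."""
    return ROUTE_CACHE[intent].template.format(**params)

index("repo.contributors", "/{owner}/{repo}/graphs/contributors-data")
print(resolve("repo.contributors", owner="facebook", repo="react"))
# /facebook/react/graphs/contributors-data
```

Once a route is indexed, subsequent requests are ordinary HTTP calls against a known URL.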

Performance

Metric                 GitHub REST API (v3)     GitHub GraphQL (v4)     Unbrowse (cached)
Speed                  ~150ms                   ~200ms                  <80ms
Rate limit             5,000/hour               5,000 points/hour       Session-based
Code search results    30 per page, 1,000 max   100 per query           Full results
Auth setup             PAT or OAuth app         PAT or OAuth app        Browser cookies
Nested data            Multiple requests        Single query (costly)   Pre-structured
Monthly cost           Free                     Free                    Free (open source)

The speed advantage comes from two factors: cached route resolution sends requests straight to a known endpoint (skipping page loads, with connection reuse avoiding repeated DNS/TLS handshakes), and GitHub's internal endpoints often return pre-structured data that the REST API requires multiple calls to assemble.

When to Use This Approach

AI agent code search: Your agent needs to find code examples, library usage patterns, or implementation references across GitHub. The internal code search endpoint returns richer context than the REST API and isn't subject to the same rate limits.

Repository analysis at scale: Analyze contributor patterns, commit activity, and code changes across many repositories. Internal endpoints like the contributor graph API return complete time-series data in a single call.

Activity monitoring: Track activity across repositories, organizations, or topics in real-time. The internal activity feed and notification endpoints provide richer context than the Events API.

Cross-repo exploration: When you need to navigate file trees, compare implementations, or trace dependencies across repositories, internal endpoints provide faster file-level access than the Contents API.

Getting Started

# Install Unbrowse globally
npm install -g unbrowse

# Run initial setup
unbrowse setup

# Make sure you're logged into GitHub in Chrome, then:
unbrowse resolve "search for React component libraries" --url https://github.com

# Explore specific repos
unbrowse resolve "show contributors for facebook/react" --url https://github.com

Unbrowse is open source, with an accompanying paper published on arXiv, and works as an MCP server for AI agents.

FAQ

Is this legal? Unbrowse uses your authenticated GitHub session and accesses the same endpoints your browser calls when you use github.com. You're accessing data you already have permission to see.

How is this different from scraping? Scraping parses GitHub's HTML (which uses Turbo fragments and is hard to parse reliably). Unbrowse calls the internal JSON endpoints — the same ones GitHub's frontend uses. Structured data, not HTML.
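The difference is easy to see in code: with a JSON endpoint, extracting a field is a dictionary lookup rather than HTML parsing. The payload shape below is invented purely for illustration:

```python
import json

# A made-up response body standing in for what an internal endpoint returns.
payload = '{"repository": {"name": "react", "stargazers": 220000}}'

data = json.loads(payload)
print(data["repository"]["name"])  # react
```

A CSS-selector scraper doing the same extraction breaks every time the markup changes; the JSON lookup only breaks if the data model itself changes.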

Does this bypass GitHub's rate limits? Unbrowse calls different endpoints than the REST/GraphQL APIs. These internal endpoints are rate-limited differently (they're designed for real-time browsing, not API consumption). You're not bypassing anything — you're using a different set of endpoints.

Can I access private repositories? Yes. Because Unbrowse uses your actual GitHub session, it can access any repository you have permission to view — including private repos and organization-internal repos.

Should I still use the official API? For write operations (creating issues, PRs, comments), the official API is the right choice. Unbrowse excels at read-heavy workloads where rate limits are the bottleneck — search, analysis, monitoring, and exploration.