fetchaller
# Install
git clone https://github.com/Averyy/fetchaller-mcp.git
cd fetchaller-mcp
uv sync && patchright install chromium

# Add to Claude Code
claude mcp add fetchaller -- $(pwd)/.venv/bin/python -m fetchaller.main
{
  "permissions": {
    "allow": [
      "mcp__fetchaller__fetch",
      "mcp__fetchaller__search",
      "mcp__fetchaller__browse_reddit",
      "mcp__fetchaller__search_reddit",
      "mcp__fetchaller__search_marketplace",
      "mcp__fetchaller__get_aliexpress_product",
      "mcp__fetchaller__search_aliexpress",
      "mcp__fetchaller__get_alibaba_product",
      "mcp__fetchaller__search_alibaba"
    ]
  }
}
## Web Fetching & Search

**ALWAYS use fetchaller MCP tools instead of WebFetch and WebSearch.** fetchaller has no domain restrictions, bypasses bot protection (Cloudflare, Akamai, etc.), and produces much cleaner markdown with site-specific content cleanup.

- **fetch**: Fetch any URL as clean markdown
- **search**: Web search (Google + DuckDuckGo combined)
- **browse_reddit** / **search_reddit**: Reddit listings and search
- **search_marketplace**: Search Kijiji + Craigslist + Facebook Marketplace
- **get_aliexpress_product** / **search_aliexpress**: AliExpress products
- **get_alibaba_product** / **search_alibaba**: Alibaba.com B2B products

Exception: If a dedicated MCP tool exists for a service (e.g., GitHub via `gh` CLI), prefer that instead.

Restart Claude Code after setup.
Why not just use WebFetch?
Claude Code's built-in WebFetch tool has several limitations that get in the way during real work:
• Domain approvals — You have to approve every new domain individually. Research across multiple sites means constant prompts.
• Reddit blocked — WebFetch cannot access Reddit at all. No subreddit browsing, no post reading, no search.
• No web search — There's no way to search Google or DuckDuckGo through WebFetch.
• No cleanup — WebFetch returns raw page content with navigation, ads, sidebars, and cookie banners included, wasting tokens.
fetchaller replaces WebFetch with a single allow rule that covers all tools. No prompts, no blocked sites, and every page gets cleaned up before it reaches Claude.
Can it search the web?
Yes. The search tool queries Google and DuckDuckGo in parallel and merges the results. Duplicate URLs are removed, and results are cached for 5 minutes so repeated queries return instantly.
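As a rough illustration of the merge, dedupe, and cache behaviour described above, here is a minimal Python sketch. It is not fetchaller's actual code: the names `merge_results` and `SearchCache`, the result shape (a dict with a `url` key), the round-robin interleaving, and the trailing-slash normalization are all assumptions made for the example. Only the 5-minute cache lifetime comes from the text.

```python
import time
from typing import Callable, Optional

CACHE_TTL_SECONDS = 5 * 60  # the 5-minute cache lifetime mentioned above


def merge_results(*result_lists: list) -> list:
    """Interleave results from several engines, keeping the first copy of each URL."""
    merged, seen = [], set()
    longest = max((len(results) for results in result_lists), default=0)
    for rank in range(longest):
        for results in result_lists:
            if rank < len(results):
                # Naive normalization (assumption): ignore a trailing slash.
                key = results[rank]["url"].rstrip("/")
                if key not in seen:
                    seen.add(key)
                    merged.append(results[rank])
    return merged


class SearchCache:
    """In-memory cache whose entries expire after a fixed time-to-live."""

    def __init__(self, ttl: float = CACHE_TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # query -> (stored_at, results)

    def get(self, query: str) -> Optional[list]:
        entry = self._entries.get(query)
        if entry is None:
            return None
        stored_at, results = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[query]  # expired, so drop it
            return None
        return results

    def put(self, query: str, results: list) -> None:
        self._entries[query] = (self._clock(), results)
```

The injectable `clock` exists only so the expiry logic can be exercised without waiting five minutes.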
The typical workflow is: use search to find relevant URLs, then use fetch to read the pages you need. Claude does this naturally when you ask it to research something.
Rate limiting is built in — Google gets a 2-second cooldown between requests, DuckDuckGo gets 1 second. If a search engine returns a CAPTCHA, fetchaller backs off automatically (2 min → 5 min → 15 min) and relies on the other engine in the meantime.
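The cooldown and backoff rules above can be sketched as a small state tracker. This is an illustration under stated assumptions, not fetchaller's implementation: the `EngineLimiter` class and its method names are hypothetical, and the text does not say what happens after a third CAPTCHA or when the backoff resets, so the sketch stays at the longest step and resets on a successful search. The cooldowns and backoff durations are the ones quoted in the text.

```python
import time

COOLDOWN_SECONDS = {"google": 2.0, "duckduckgo": 1.0}  # per-engine cooldowns from the text
BACKOFF_STEPS_SECONDS = [2 * 60, 5 * 60, 15 * 60]      # 2 min, then 5 min, then 15 min


class EngineLimiter:
    """Tracks when each search engine may be queried again."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._next_allowed = {name: float("-inf") for name in COOLDOWN_SECONDS}
        self._captcha_strikes = {name: 0 for name in COOLDOWN_SECONDS}

    def available(self) -> list:
        """Engines that are neither cooling down nor backing off right now."""
        now = self._clock()
        return [name for name, ready_at in self._next_allowed.items() if now >= ready_at]

    def record_request(self, engine: str) -> None:
        """Start the normal cooldown after sending a request."""
        ready_at = self._clock() + COOLDOWN_SECONDS[engine]
        self._next_allowed[engine] = max(self._next_allowed[engine], ready_at)

    def record_captcha(self, engine: str) -> None:
        """Escalate the backoff; stays at the last step once reached (assumption)."""
        step = min(self._captcha_strikes[engine], len(BACKOFF_STEPS_SECONDS) - 1)
        self._captcha_strikes[engine] += 1
        self._next_allowed[engine] = self._clock() + BACKOFF_STEPS_SECONDS[step]

    def record_success(self, engine: str) -> None:
        """Reset the escalation after a clean response (assumption)."""
        self._captcha_strikes[engine] = 0
```

A caller would query only the engines returned by `available()`, which is how one engine can carry the load while the other is backing off.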
What do I need?
Requirements: Python 3.12+ and uv. That's it.
Install:
git clone https://github.com/Averyy/fetchaller-mcp.git
cd fetchaller-mcp
uv sync && patchright install chromium
Add to Claude Code:
claude mcp add fetchaller -- $(pwd)/.venv/bin/python -m fetchaller.main
No API keys required. No Docker. No build step. Restart Claude Code after adding the server and you're ready to go.
How do I set up permissions?
By default, Claude Code asks you to approve each fetchaller tool call. To allow all tools without prompting, add them to your Claude Code settings file (~/.claude/settings.json applies to every project):
"allow": ["mcp__fetchaller__fetch", "mcp__fetchaller__search", ...]
The full list is shown in the Setup section above. Copy the settings.json snippet and paste it into your config file.
Per-project permissions: If you only want fetchaller allowed in certain projects, use .claude/settings.local.json in the project root instead. Same format, scoped to that project.
CLAUDE.md: Add the CLAUDE.md snippet to your project to instruct Claude to prefer fetchaller over WebFetch and WebSearch when doing research.
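If you would rather not edit the settings file by hand, a small script can merge the tool names into an existing file without disturbing other rules. This is a hedged sketch, not part of fetchaller: `add_allow_rules` and `update_settings_file` are hypothetical helper names, and the script assumes the settings file is plain JSON with the `permissions.allow` layout shown earlier.

```python
import json
from pathlib import Path

# Tool names copied from the settings snippet in the setup section.
FETCHALLER_TOOLS = [
    "mcp__fetchaller__fetch",
    "mcp__fetchaller__search",
    "mcp__fetchaller__browse_reddit",
    "mcp__fetchaller__search_reddit",
    "mcp__fetchaller__search_marketplace",
    "mcp__fetchaller__get_aliexpress_product",
    "mcp__fetchaller__search_aliexpress",
    "mcp__fetchaller__get_alibaba_product",
    "mcp__fetchaller__search_alibaba",
]


def add_allow_rules(settings: dict, rules: list) -> dict:
    """Return a copy of settings with rules appended to permissions.allow, skipping duplicates."""
    updated = json.loads(json.dumps(settings))  # deep copy of JSON-compatible data
    allow = updated.setdefault("permissions", {}).setdefault("allow", [])
    for rule in rules:
        if rule not in allow:
            allow.append(rule)
    return updated


def update_settings_file(path: Path) -> None:
    """Merge the fetchaller rules into the settings file at path, creating it if needed."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_allow_rules(current, FETCHALLER_TOOLS), indent=2) + "\n")
```

For a per-project setup, the call would be `update_settings_file(Path(".claude/settings.local.json"))`, run from the project root.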
What sites are optimized?
Each site gets tailored CSS selectors, HTML cleanup, and regex post-processing. Token savings range from 65% to 98% compared to raw HTML:
• GitHub — Issues, PRs, repos, discussions, org pages
• Stack Overflow — All Stack Exchange network sites
• Hacker News — Stories and comment threads
• Medium — Including 100s of custom domains
• Hugging Face — Models, datasets, spaces
• Wikipedia — Edit buttons, navboxes, TOC stripped
• Reddit — Auto-converts to old.reddit.com
• Amazon — All TLDs (.com, .ca, .co.uk, etc.)
• Marketplaces — eBay, Kijiji, Craigslist, Facebook Marketplace. Search and browse listings
• AliExpress — Product pages and search
• Alibaba — B2B product pages and search
• DigiKey — Optional API mode with OAuth2
• Mouser — Optional API mode
• Forums — XenForo, vBulletin, phpBB, Discourse
All other sites still get universal junk removal (nav, footer, ads, cookie banners, modals).
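To give a feel for what universal junk removal involves, here is a minimal standard-library sketch that drops navigation, footers, and scripts and keeps the remaining text. It is a simplified stand-in, not fetchaller's pipeline: the `JUNK_TAGS` set, the `JunkStripper` class, and `strip_junk` are assumptions for illustration, it matches on tag names rather than CSS selectors, and it emits plain text rather than markdown.

```python
import re
from html.parser import HTMLParser

# Elements treated as page chrome (assumption; real cleanup rules are more detailed).
JUNK_TAGS = {"nav", "footer", "aside", "script", "style", "noscript"}


class JunkStripper(HTMLParser):
    """Collects text that is not nested inside any junk element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._junk_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in JUNK_TAGS:
            self._junk_depth += 1

    def handle_endtag(self, tag):
        if tag in JUNK_TAGS and self._junk_depth:
            self._junk_depth -= 1

    def handle_data(self, data):
        # Regex post-processing step: collapse runs of whitespace.
        text = re.sub(r"\s+", " ", data).strip()
        if text and not self._junk_depth:
            self.chunks.append(text)


def strip_junk(html: str) -> str:
    """Return the page text with junk elements removed, one text chunk per line."""
    parser = JunkStripper()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Tracking depth only for junk tags keeps the sketch tolerant of unclosed `<p>` or `<li>` elements elsewhere in the page.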
Does Reddit work?
Yes. Reddit is fully supported with two dedicated tools plus fetch:
• browse_reddit — List posts from any subreddit. Sort by hot, new, top, or rising. Includes pagination.
• search_reddit — Search posts across all of Reddit or within a specific subreddit. Filter by time period.
• fetch — Read full discussion threads with all comments. URLs are auto-converted to old.reddit.com for cleaner, lighter HTML.
This is a major advantage over WebFetch, which blocks Reddit entirely. Claude can now browse subreddits, search for discussions, and read full threads as part of its research workflow.
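The old.reddit.com rewrite mentioned above amounts to swapping the hostname and leaving the rest of the URL alone. A minimal sketch, assuming a fixed set of Reddit hostnames (`REDDIT_HOSTS` and `to_old_reddit` are illustrative names, and the exact hosts fetchaller rewrites are not stated in the text):

```python
from urllib.parse import urlsplit, urlunsplit

# Hostnames to rewrite (assumption for illustration).
REDDIT_HOSTS = {"reddit.com", "www.reddit.com", "new.reddit.com", "m.reddit.com"}


def to_old_reddit(url: str) -> str:
    """Point Reddit URLs at old.reddit.com; return any other URL unchanged."""
    parts = urlsplit(url)
    if parts.hostname in REDDIT_HOSTS:
        parts = parts._replace(netloc="old.reddit.com")
    return urlunsplit(parts)
```

Path, query string, and fragment pass through untouched, so a link to a specific thread or sort order still resolves to the same content.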
Does it bypass Cloudflare?
Yes. fetchaller includes a bot challenge solver that handles multiple protection systems transparently:
• Cloudflare — Turnstile and managed challenges
• Akamai — Bot Manager challenges
• DataDome — CAPTCHA interstitials
• Alibaba Cloud WAF — ACW challenges (solved inline, ~1ms)
When a challenge is detected, fetchaller solves it automatically and caches the resulting cookies per-domain. Subsequent requests to the same site skip the challenge entirely. No configuration, API keys, or proxy services needed.
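The solve-once, reuse-per-domain flow described above can be sketched as follows. This is an outline of the caching idea only: `fetch_page`, `send`, and `solve` are hypothetical names, `send` is assumed to return a `(challenged, body)` pair, and the sketch says nothing about how a real solver handles any particular protection system.

```python
from typing import Callable, Dict, Tuple
from urllib.parse import urlsplit

Cookies = Dict[str, str]


def fetch_page(
    url: str,
    cache: Dict[str, Cookies],
    send: Callable[[str, Cookies], Tuple[bool, str]],
    solve: Callable[[str], Cookies],
) -> str:
    """Fetch url, solving a bot challenge at most once per domain.

    send(url, cookies) performs the request and reports whether a challenge
    page came back; solve(url) returns the cookies that clear the challenge.
    Both are placeholders supplied by the caller.
    """
    domain = urlsplit(url).hostname or ""
    cookies = cache.get(domain, {})
    challenged, body = send(url, cookies)
    if challenged:
        cookies = solve(url)
        cache[domain] = cookies  # kept in memory only, keyed by domain
        challenged, body = send(url, cookies)
    return body
```

Because the cache is keyed by hostname, a second page on the same site reuses the stored cookies and never reaches the solver.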
Is this safe?
Yes. fetchaller is read-only by design:
• Only performs GET requests — never POST, PUT, DELETE, or any write operation.
• No persistent cookies — challenge cookies are cached in memory during the session but not saved to disk by default.
• No authentication — never logs in, never maintains sessions, never accesses private data.
• No file writes — returns content as text to Claude, never writes to your filesystem.
It's functionally equivalent to opening a URL in an incognito browser tab, reading the page, and closing the tab.
What about curl?
fetchaller and curl serve different purposes. Use each where it fits best:
Use fetchaller when:
• Reading web pages, documentation, articles, or forum threads
• Researching topics across multiple sites
• Fetching content from bot-protected sites
• You want clean markdown instead of raw HTML
Use curl when:
• Making POST, PUT, or DELETE requests
• Debugging HTTP headers or status codes
• Hitting APIs that require authentication tokens
• You need raw HTTP response details
Does it work with other MCP clients?
Yes. fetchaller is a standard MCP (Model Context Protocol) server that communicates over stdio. It works with any MCP-compatible client:
• Claude Code — Primary target. Use claude mcp add to register.
• Claude Desktop — Add to claude_desktop_config.json under mcpServers.
• Cursor — Add as an MCP server in Cursor's settings.
• Any MCP client — Point it at .venv/bin/python -m fetchaller.main.
The server exposes the same tools regardless of which client connects. Configuration (permissions, CLAUDE.md) is client-specific but the server itself is universal.
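For clients configured through a JSON file, such as Claude Desktop's claude_desktop_config.json, the server entry is a command plus arguments under `mcpServers`. The sketch below builds that entry in Python. The helper names and the example checkout path are assumptions, and the `.venv/bin/python` layout applies to macOS and Linux (a Windows virtual environment places the interpreter elsewhere).

```python
import json
from pathlib import PurePosixPath


def fetchaller_server_entry(repo_dir: PurePosixPath) -> dict:
    """Build the stdio launch entry for a fetchaller checkout at repo_dir."""
    return {
        "command": str(repo_dir / ".venv" / "bin" / "python"),
        "args": ["-m", "fetchaller.main"],
    }


def add_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with the entry registered under mcpServers."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = entry
    updated["mcpServers"] = servers
    return updated


def render_config(config: dict) -> str:
    """Serialize the config the way it would be written to the JSON file."""
    return json.dumps(config, indent=2)
```

Calling `render_config(add_server({}, "fetchaller", fetchaller_server_entry(PurePosixPath("/path/to/fetchaller-mcp"))))` yields a block that can be pasted into the client's config, with the placeholder path replaced by the real checkout location.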