🔥 Official Firecrawl MCP Server - Adds powerful web scraping and search to Cursor, Claude and any other LLM clients.
Updated Jun 30, 2026 - JavaScript
Open source web infrastructure for AI. Scrape, crawl, and automate the web, clean markdown, browser sessions, ready for your agents.
Model Context Protocol (MCP) Server for Graphlit Platform
A fork of Dragnet that also extracts author, headline, date, and keywords from context, with built-in metadata extraction, all in one package
Full-content web fetcher for AI agents - Chrome TLS fingerprinting, browser impersonation, and multi-strategy article extraction
A powerful MCP server extension providing web search and content extraction capabilities. Integrates DuckDuckGo search functionality and URL content extraction into your MCP environment, enabling AI assistants to search the web and extract webpage content programmatically.
Readability2 converts HTML to plain text.
Configurable web access extension for pi that routes search, contents, answers, and research across Claude, Codex, Exa, Gemini, Parallel, and Valyu providers.
Next.js template for seamless PDF parsing using pdf2json and FilePond. Ideal for developers seeking a ready-to-use solution for PDF content extraction in Next.js projects.
A collection of OpenClaw Agent Skills - search, analysis, content extraction, and more.
Local browser toolkit for AI agents: deep research and browser-use automation with local Chrome (CDP) + Playwright. Flexible, extensible scripts for web navigation, extraction, and workflow automation - built for reproducible research and agent-driven browsing.
Agent Skills for integrating You.com capabilities into agentic workflows and AI development tools - guided integrations for Claude, OpenAI, Vercel AI SDK, and Teams.ai
DOM Based Content Extraction via Text Density
Pure Rust document-to-Markdown converter for LLM workflows (DOCX, PPTX, XLSX, HTML, CSV, JSON, XML, images).
Pure Ruby implementation of the Boilerpipe content extraction algorithm, tuned for online articles
Fast, accurate web content extraction in Rust. ML page-type classification, per-type extraction, confidence scoring. F1=0.966 on ScrapingHub (#1), F1=0.859 across 2,008 annotated pages (1,497 development + 511 held-out test
Web content extraction using machine learning
The AI research assistant that cites real sources honestly and searches the web. Works with Claude, Cursor, and any MCP client.
Model Context Protocol (MCP) tool for parsing websites using the Jina.ai Reader
mcp-web-scrape - Clean, cache-aware web content fetcher for AI agents. Fetch any URL → extract readable content → return Markdown/JSON with citations. Fast caching, robots.txt compliant, Markdown-ready output, works with ChatGPT/Claude Desktop.