A fully automated deep research MCP (Model Context Protocol) server that provides comprehensive topic analysis using multiple search engines and AI-powered synthesis. It is built to comply with the MCP 2025-06-18 specification.
Topic Deep Diver rivals commercial solutions like Perplexity's Deep Research while maintaining full automation and extensive online search capabilities. The system performs autonomous multi-step research, evaluates sources, and synthesizes findings into comprehensive reports, all without user intervention.
- Zero user intervention during research process
- Autonomous decision-making for search strategies
- Intelligent stopping criteria based on information saturation
- Complete end-to-end research pipeline
- Multi-Tier Search Strategy: Web, academic, and specialized databases
- 15+ Search Engines: SearXNG, Google Scholar, PubMed, arXiv, and more
- Real-Time Content Extraction: HTML to markdown conversion with metadata
- Source Diversity Optimization: Ensures balanced perspective coverage
- Query Decomposition: Breaks complex topics into structured sub-questions
- Source Credibility Scoring: 0-100 scale with bias detection
- Information Synthesis: Multi-source aggregation with citation tracking
- Gap Identification: Automatically identifies and fills knowledge gaps
- Structured Tool Output: JSON-formatted research reports
- OAuth Resource Server: Secure authentication with Resource Indicators (RFC 8707)
- Resource Links: Efficient handling of large research artifacts
- Enhanced Security: Follows latest MCP security best practices
The server exposes three core tools for MCP clients (Claude, OpenCode, Cline):
Main research orchestrator that performs autonomous deep research on any topic.
Parameters:
- topic (required): The research topic or question
- scope (optional): Research depth, one of "quick", "comprehensive", or "academic"
Returns: Structured research report with findings, sources, and citations
Monitor the progress of ongoing research sessions.
Parameters:
session_id(required): Unique identifier for the research session
Returns: Real-time progress updates and current research stage
Export research findings in various formats.
Parameters:
- session_id (required): Research session to export
- format (optional): Output format, one of "markdown", "pdf", "json", or "html"
Returns: Resource link to the exported research report
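As a rough illustration of how a client might invoke the research tool, the sketch below builds a JSON-RPC 2.0 `tools/call` request, which is the method MCP uses for tool invocation. The tool name `deep_research`, the helper `build_research_call`, and the default scope are assumptions made for this example; this section does not name the tools, so check the server's `tools/list` response for the real names.

```python
import json

# Scope values listed in the parameter description above.
VALID_SCOPES = {"quick", "comprehensive", "academic"}


def build_research_call(topic: str, scope: str = "comprehensive", request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the research tool.

    The tool name "deep_research" and the default scope are hypothetical.
    """
    if not topic.strip():
        raise ValueError("topic is required")
    if scope not in VALID_SCOPES:
        raise ValueError(f"scope must be one of {sorted(VALID_SCOPES)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "deep_research",  # assumed tool name
            "arguments": {"topic": topic, "scope": scope},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_research_call("solid-state batteries", "quick"), indent=2))
```

Validating `scope` on the client side is optional; it simply surfaces typos before a request reaches the server.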
User Query → Research Planner → Multi-Search Engine → Content Processor → Knowledge Synthesizer → Structured Report
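The flow above can be pictured as a chain of stages that each enrich a shared context. The following is only a sketch under that assumption: every stage body is a stub, and the names `plan`, `search`, `process`, `synthesize`, and `run_pipeline` are hypothetical rather than taken from the project's code.

```python
from functools import reduce
from typing import Any, Callable

Context = dict[str, Any]
Stage = Callable[[Context], Context]


def plan(ctx: Context) -> Context:
    """Research Planner: turn the query into sub-questions (stubbed)."""
    return {**ctx, "sub_questions": [f"What is {ctx['query']}?"]}


def search(ctx: Context) -> Context:
    """Multi-Search Engine: one placeholder hit per sub-question."""
    hits = [f"https://example.org/result/{i}" for i, _ in enumerate(ctx["sub_questions"])]
    return {**ctx, "hits": hits}


def process(ctx: Context) -> Context:
    """Content Processor: wrap each hit as a document record."""
    return {**ctx, "documents": [{"url": url, "text": ""} for url in ctx["hits"]]}


def synthesize(ctx: Context) -> Context:
    """Knowledge Synthesizer: assemble the structured report."""
    report = {"topic": ctx["query"], "sources": [d["url"] for d in ctx["documents"]]}
    return {**ctx, "report": report}


# Stage order mirrors the diagram above.
STAGES: list[Stage] = [plan, search, process, synthesize]


def run_pipeline(query: str) -> Context:
    """Feed the query through each stage in turn and return the report."""
    final = reduce(lambda ctx, stage: stage(ctx), STAGES, {"query": query})
    return final["report"]
```

Keeping each stage a plain function from context to context makes it easy to test stages in isolation or to insert new ones.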
- Analyzes topic complexity and scope
- Generates comprehensive search strategy
- Creates research taxonomy and keywords
- Determines stopping criteria automatically
- Parallel execution across multiple search engines
- Dynamic search refinement based on results
- Source diversity optimization
- Real-time result quality assessment
- Automatic source credibility scoring
- Content deduplication and clustering
- Bias detection and perspective analysis
- Information freshness validation
- Cross-source fact verification
- Narrative structure generation
- Citation tracking and management
- Gap identification and resolution
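Two of the behaviours listed above, parallel execution across search engines and content deduplication, can be sketched with `asyncio.gather`. This is an illustration only: `fake_engine` is a hypothetical stand-in that returns canned data instead of calling a real backend, and the URL normalization is deliberately crude.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchResult:
    url: str
    title: str
    engine: str


async def fake_engine(name: str, query: str) -> list[SearchResult]:
    """Stand-in for a real search backend (hypothetical; returns canned data)."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return [
        SearchResult(f"https://example.org/{query}", f"{query} overview", name),
        SearchResult(f"https://example.org/{name}/{query}", f"{query} via {name}", name),
    ]


def normalize(url: str) -> str:
    """Crude URL key for deduplication: lowercase, no trailing slash."""
    return url.lower().rstrip("/")


async def search_all(query: str, engines: list[str]) -> list[SearchResult]:
    """Query every engine concurrently, then drop duplicate URLs."""
    batches = await asyncio.gather(
        *(fake_engine(name, query) for name in engines),
        return_exceptions=True,  # one failing engine should not sink the rest
    )
    seen: set[str] = set()
    unique: list[SearchResult] = []
    for batch in batches:
        if isinstance(batch, BaseException):
            continue  # skip engines that raised
        for result in batch:
            key = normalize(result.url)
            if key not in seen:
                seen.add(key)
                unique.append(result)
    return unique


if __name__ == "__main__":
    for r in asyncio.run(search_all("fusion", ["searxng", "brave", "arxiv"])):
        print(r.engine, r.url)
```

Because `asyncio.gather` returns results in the order the coroutines were passed, the first engine listed wins when two engines return the same URL.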
Primary Search Engines (General Web):
- SearXNG instances (privacy-focused)
- Brave Search API (privacy-respecting)
- Google/Bing Search APIs (comprehensive coverage)
Academic Search Engines:
- Google Scholar API (200M+ scholarly articles)
- PubMed API (30M+ medical citations)
- arXiv API (STEM preprints)
- Crossref API (scholarly metadata)
Content Extraction Tools:
- Fetch MCP server integration
- Firecrawl for JavaScript-heavy sites
- Jina Reader for clean text extraction
Query Decomposition Algorithm:
- NLP-based key concept extraction
- Question type identification (factual, analytical, comparative)
- Research taxonomy generation
- Sub-question prioritization by importance
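To make the decomposition steps concrete, here is a small sketch that uses keyword heuristics and fixed templates. It is a simplified stand-in, not the NLP-based approach described above; the cue words, stopword list, templates, and function names are all assumptions made for illustration.

```python
import re

# Hypothetical cue words for the question types named above.
# "comparative" is checked first because dicts preserve insertion order.
TYPE_CUES = {
    "comparative": ("versus", "vs", "compare", "compared", "difference", "better"),
    "analytical": ("why", "how", "impact", "effect", "cause"),
}

STOPWORDS = {"the", "a", "an", "of", "to", "in", "is", "are", "and", "or", "for",
             "on", "what", "why", "how", "does", "do", "vs", "versus"}


def classify(question: str) -> str:
    """Label a question as comparative, analytical, or factual (the fallback)."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    for qtype, cues in TYPE_CUES.items():
        if words & set(cues):
            return qtype
    return "factual"


def key_concepts(question: str) -> list[str]:
    """Return non-stopword tokens in order of first appearance."""
    seen: list[str] = []
    for word in re.findall(r"[a-z0-9-]+", question.lower()):
        if word not in STOPWORDS and word not in seen:
            seen.append(word)
    return seen


def decompose(topic: str) -> list[dict]:
    """Generate prioritized sub-questions from templates (priority 1 comes first)."""
    concepts = " ".join(key_concepts(topic))
    templates = [
        f"What is {concepts}?",
        f"How does {concepts} work?",
        f"How does {concepts} compare to alternatives?",
    ]
    return [
        {"priority": i, "question": q, "type": classify(q)}
        for i, q in enumerate(templates, start=1)
    ]
```

A real implementation would likely replace the templates and cue lists with a language model or an NLP library, but the output shape (prioritized, typed sub-questions) would be similar.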
Source Credibility Scoring:
- Domain authority assessment
- Publication recency and relevance
- Author expertise verification
- Citation count and impact analysis
- Cross-reference validation
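One plausible way to combine the five signals above into the 0-100 score is a weighted sum. The sketch below assumes each signal has already been normalized to the range 0.0-1.0; the weights, field names, and the `credibility_score` function are hypothetical and are not the project's actual formula.

```python
from dataclasses import dataclass

# Hypothetical weights; they sum to 1.0 so the result stays on a 0-100 scale.
WEIGHTS = {
    "domain_authority": 0.30,
    "recency": 0.20,
    "author_expertise": 0.20,
    "citation_impact": 0.15,
    "cross_reference": 0.15,
}


@dataclass
class SourceSignals:
    """Per-source signals, each expected in the range 0.0-1.0."""
    domain_authority: float
    recency: float
    author_expertise: float
    citation_impact: float
    cross_reference: float


def credibility_score(signals: SourceSignals) -> int:
    """Return a weighted credibility score on a 0-100 scale."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = getattr(signals, name)
        value = min(1.0, max(0.0, value))  # clamp out-of-range inputs
        total += weight * value
    return round(total * 100)
```

Clamping each input keeps a single bad signal from pushing the score outside the documented range.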
Completion Criteria:
- Information saturation detection
- Confidence threshold achievement (≥85% coverage)
- Maximum time/resource limits
- All major perspectives captured
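The completion criteria above could be checked after each search round with a function like the one below. How the criteria combine is an assumption here (any one satisfied condition stops the run, with coverage and perspectives checked together), as are the `ResearchState` fields, the saturation floor, and the 240-second default limit.

```python
from dataclasses import dataclass


@dataclass
class ResearchState:
    coverage: float            # fraction of sub-questions answered, 0.0-1.0
    new_facts_last_round: int  # novel findings from the latest search round
    elapsed_seconds: float
    perspectives_found: int
    perspectives_expected: int


def should_stop(
    state: ResearchState,
    coverage_target: float = 0.85,  # the 85% threshold listed above
    saturation_floor: int = 1,      # assumed: fewer new facts than this means saturated
    time_limit: float = 240.0,      # assumed resource limit, in seconds
) -> tuple[bool, str]:
    """Return (stop, reason) for the current research state."""
    if state.elapsed_seconds >= time_limit:
        return True, "time limit reached"
    if (state.coverage >= coverage_target
            and state.perspectives_found >= state.perspectives_expected):
        return True, "coverage and perspectives satisfied"
    if state.new_facts_last_round < saturation_floor:
        return True, "information saturated"
    return False, "continue"
```

Returning the reason alongside the decision makes it straightforward to report why a session ended.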
- Python 3.11+
- MCP 2025-06-18 SDK
- Redis (for caching)
- External API keys (optional, for enhanced search)
- Clone the repository:
git clone https://github.com/RouHim/topic-deep-diver.git
cd topic-deep-diver
- Install dependencies:
pip install -r requirements.txt
- Configure the server:
cp config/config.example.yaml config/config.yaml
# Edit configuration with your API keys and preferences
- Start the MCP server:
python -m topic_deep_diver
Add to your Claude Desktop configuration:
{
  "mcpServers": {
    "topic-deep-diver": {
      "command": "python",
      "args": ["-m", "topic_deep_diver"],
      "env": {
        "CONFIG_PATH": "/path/to/config.yaml"
      }
    }
  }
}
topic-deep-diver/
├── topic_deep_diver/     # Main package
│   ├── server/           # MCP server implementation
│   ├── research/         # Research pipeline components
│   ├── search/           # Search engine integrations
│   ├── analysis/         # Source analysis and scoring
│   ├── synthesis/        # Information synthesis
│   └── utils/            # Utilities and helpers
├── tests/                # Test suite
├── docs/                 # Documentation
├── config/               # Configuration files
└── AGENTS.md             # Project knowledge base
- Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install development dependencies:
pip install -r requirements-dev.txt
- Run tests:
pytest tests/ -v
- Run with debugging:
python -m topic_deep_diver --debug
- MCP Server Setup with 2025-06-18 specification
- Core MCP Tools Implementation
- Basic Search Integration
- Query Processing Engine
- Source Analysis Engine
- Information Synthesis Engine
- Academic Search Integration
- Real-time Processing
- Quality Assurance Systems
- Security Implementation (OAuth, RFC 8707)
- Performance Optimization
- Testing & Documentation
- Research Completion Time: <4 minutes for comprehensive topics
- Source Diversity: Minimum 10 unique, credible sources per report
- Accuracy Rate: >90% fact verification across sources
- Information Coverage: >85% coverage of major topic aspects
We welcome contributions! Please see our Contributing Guidelines for details.
- Check the Issues for current tasks
- Read AGENTS.md for project knowledge and context
- Follow the established architecture and coding standards
- Ensure all tests pass before submitting PRs
This project is licensed under the MIT License - see the LICENSE file for details.
- Model Context Protocol for the foundational specification
- Perplexity AI for inspiration on deep research capabilities
- The open-source MCP community for tools and integrations
- Documentation
- Issues
- Discussions
- Contact: Create an issue
Status: In Development | Version: 0.1.0-alpha | MCP Spec: 2025-06-18