YanCotta / AdvancedRAG

AdvancedRAG is a repository showcasing advanced Retrieval Augmented Generation (RAG) techniques.

Repository from GitHub: https://github.com/YanCotta/AdvancedRAG

🚀 AdvancedRAG

License: MIT • Python 3.8+ • Documentation

A state-of-the-art implementation of Retrieval Augmented Generation with advanced techniques

Features • Getting Started • Documentation • Contributing


🎯 Overview

AdvancedRAG implements advanced Retrieval Augmented Generation (RAG) techniques, featuring multi-strategy retrieval, automated evaluation, and a modular architecture.

✨ Key Features

πŸ” Multi-Strategy Retrieval Pipeline

  • AutoMerging Retrieval with hierarchical node parsing
  • Sentence Window Retrieval for granular context
  • Cross-encoder reranking for enhanced relevance
  • Multi-hop reasoning capabilities
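As a rough illustration of the cross-encoder reranking step listed above, the sketch below re-scores each (query, candidate) pair and keeps the highest-scoring candidates. The names `rerank`, `score_pairs`, and `overlap_scorer` are hypothetical and not taken from this repository; `overlap_scorer` is only a toy word-overlap stand-in so the example runs without downloading a model.

```python
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]


def rerank(
    query: str,
    candidates: Sequence[str],
    score_pairs: Callable[[List[Pair]], Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every (query, candidate) pair jointly and return the top_k candidates."""
    pairs = [(query, doc) for doc in candidates]
    scores = score_pairs(pairs)
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return [(doc, float(score)) for doc, score in ranked[:top_k]]


def overlap_scorer(pairs: List[Pair]) -> List[float]:
    """Toy stand-in for a cross-encoder (assumption): counts shared lowercase words."""
    return [
        float(len(set(q.lower().split()) & set(d.lower().split())))
        for q, d in pairs
    ]


docs = [
    "Auto merging retrieval merges child nodes into parents.",
    "TruLens records feedback scores.",
    "Sentence window retrieval expands each hit.",
]
top = rerank("how does auto merging retrieval work", docs, overlap_scorer, top_k=2)
for doc, score in top:
    print(f"{score:.1f}  {doc}")
```

In a real setup, `score_pairs` would be a cross-encoder model's pair-scoring function (for example the `predict` method of a `CrossEncoder` from the sentence-transformers package, if that is what the pipeline uses).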

📊 Advanced Evaluation Framework

  • Integrated TruLens evaluation
  • Confidence scoring and analysis
  • Automated groundedness assessment
  • Performance metrics dashboard

πŸ“ File Structure

AdvancedRAG/
├── AutoMergingRetrieval/
│   ├── utils.py              # Core utilities
│   └── AutoMergingRetrieval.py
├── AdvancedRAGPipeline/
│   ├── src/
│   │   ├── utils.py         # Pipeline utilities
│   │   └── pipeline.py      # RAG orchestration
│   └── data/                # Evaluation sets
└── data/                    # Shared resources

🛠️ Techniques and Methodologies

AutoMerging Retrieval

Parses each document into a hierarchy of nodes and merges retrieved nodes across levels of granularity, so that related fragments are returned as one larger, more contextualized passage.
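The merging idea can be sketched in plain Python, under the assumption that a parent node replaces its retrieved children once the share of children retrieved exceeds a threshold. The function `auto_merge`, the node IDs, and the default threshold are hypothetical illustrations, not this repository's code.

```python
from collections import defaultdict
from typing import Dict, List, Set


def auto_merge(
    retrieved: List[str],
    parent_of: Dict[str, str],
    children_of: Dict[str, List[str]],
    threshold: float = 0.5,
) -> List[str]:
    """Replace retrieved leaf IDs with their parent ID when enough siblings were hit."""
    # Group the retrieved leaves by their parent.
    hits: Dict[str, Set[str]] = defaultdict(set)
    for node_id in retrieved:
        parent = parent_of.get(node_id)
        if parent is not None:
            hits[parent].add(node_id)

    # Promote a parent when the share of its children retrieved exceeds the threshold.
    promoted = {
        parent
        for parent, kids in hits.items()
        if len(kids) / len(children_of[parent]) > threshold
    }

    merged: List[str] = []
    for node_id in retrieved:
        parent = parent_of.get(node_id)
        target = parent if parent in promoted else node_id
        if target not in merged:  # keep first-seen order, drop duplicates
            merged.append(target)
    return merged


children_of = {"sec1": ["s1a", "s1b", "s1c"], "sec2": ["s2a", "s2b", "s2c"]}
parent_of = {child: parent for parent, kids in children_of.items() for child in kids}

result = auto_merge(["s1a", "s2b", "s1c"], parent_of, children_of, threshold=0.5)
print(result)  # ['sec1', 's2b']
```

Two of the three children of `sec1` were retrieved, so they collapse into the parent; only one child of `sec2` was retrieved, so it stays as a leaf.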

Sentence Window Retrieval

Extracts text in overlapping windows to capture granular context, enhancing retrieval precision.
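A minimal sketch of the windowing step, assuming a naive punctuation-based sentence splitter and a symmetric window of neighbouring sentences. `build_sentence_windows` and `window_size` are hypothetical names chosen for illustration.

```python
import re
from typing import Dict, List


def build_sentence_windows(text: str, window_size: int = 1) -> List[Dict[str, str]]:
    """Pair each sentence with a window of up to window_size neighbours on each side."""
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    records = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        records.append({"sentence": sentence, "window": " ".join(sentences[start:end])})
    return records


text = (
    "RAG retrieves context. The context is passed to an LLM. "
    "The LLM writes an answer. Evaluation checks it."
)
records = build_sentence_windows(text, window_size=1)
print(records[1]["sentence"])
print(records[1]["window"])
```

The single sentence is what gets matched against the query, while the wider window is what would be handed to the language model; adjacent windows overlap by construction.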

TruLens Evaluation

Integrates feedback mechanisms that measure answer relevance and groundedness, making low-quality responses easier to detect.
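To make the notion of groundedness concrete, here is a deliberately simple lexical proxy: the fraction of answer sentences whose words mostly appear in the retrieved context. This is a swapped-in stand-in for illustration only and is not how TruLens computes its scores; `groundedness_proxy` and the 0.6 cutoff are assumptions.

```python
import re
from typing import List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def groundedness_proxy(answer: str, contexts: List[str], cutoff: float = 0.6) -> float:
    """Share of answer sentences with at least `cutoff` of their tokens found in the contexts."""
    context_tokens: Set[str] = set()
    for context in contexts:
        context_tokens |= _tokens(context)

    sentences = [s for s in re.split(r"(?<=[.!?])\s+", answer.strip()) if s]
    if not sentences:
        return 0.0

    supported = 0
    for sentence in sentences:
        toks = _tokens(sentence)
        if toks and len(toks & context_tokens) / len(toks) >= cutoff:
            supported += 1
    return supported / len(sentences)


contexts = ["The Eiffel Tower is in Paris and was completed in 1889."]
answer = "The Eiffel Tower is in Paris. It was designed by aliens."
score = groundedness_proxy(answer, contexts)
print(score)  # 0.5
```

The first sentence is fully covered by the context and the second is not, so half of the answer counts as supported.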

🔧 Implementation Details

AutoMergingRetrieval
  • Implements dynamic node size adjustment
  • Uses similarity-based merging strategies
  • Supports customizable merging thresholds
Advanced RAG Pipeline
  • Integrates multiple retrieval strategies
  • Features automated evaluation loops
  • Provides detailed performance metrics
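An automated evaluation loop of the kind listed above could look roughly like the following sketch: run every evaluation question through the pipeline, apply each feedback function, and average the scores per metric. `run_evaluation`, `answer_fn`, and the two toy feedback functions are hypothetical and only stand in for the real pipeline and metrics.

```python
from statistics import mean
from typing import Callable, Dict, List

FeedbackFn = Callable[[str, str], float]


def run_evaluation(
    answer_fn: Callable[[str], str],
    questions: List[str],
    feedback_fns: Dict[str, FeedbackFn],
) -> Dict[str, float]:
    """Return the mean score of each feedback function over all questions."""
    per_metric: Dict[str, List[float]] = {name: [] for name in feedback_fns}
    for question in questions:
        answer = answer_fn(question)
        for name, feedback in feedback_fns.items():
            per_metric[name].append(feedback(question, answer))
    return {name: (mean(scores) if scores else 0.0) for name, scores in per_metric.items()}


def toy_pipeline(question: str) -> str:
    """Stand-in for a RAG pipeline (assumption): refuses questions marked 'unknown'."""
    return "" if "unknown" in question else f"Answer to: {question}"


feedback_fns = {
    "non_empty": lambda q, a: 1.0 if a else 0.0,
    "echoes_question": lambda q, a: 1.0 if q in a else 0.0,
}
report = run_evaluation(toy_pipeline, ["what is RAG?", "unknown topic"], feedback_fns)
print(report)
```

Keeping the feedback functions in a dictionary makes it easy to add or remove metrics without touching the loop itself.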

🚦 Setup & Usage

Prerequisites

  • Python 3.8+
  • OpenAI API key
  • HuggingFace API key

Installation

  1. Clone the repository:

    git clone https://github.com/YanCotta/AdvancedRAG.git
    cd AdvancedRAG
  2. Install dependencies:

    pip install -r requirements.txt
  3. Configure environment: create a .env file in the project root:

    OPENAI_API_KEY=your_openai_api_key
    HUGGINGFACE_API_KEY=your_huggingface_api_key
  4. Run the pipelines:

    # For basic and auto-merging retrieval
    python src/run_retrieval.py
    
    # For full RAG pipeline with evaluations
    python AdvancedRAGPipeline/src/run_pipeline.py
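Before running the pipelines, it can help to confirm that both keys from step 3 are actually set. The sketch below parses `.env`-style text with the standard library and reports missing keys; `parse_env` and `missing_keys` are hypothetical helpers, and the repository may load its configuration differently.

```python
from typing import Dict, List, Sequence

REQUIRED_KEYS = ("OPENAI_API_KEY", "HUGGINGFACE_API_KEY")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str], required: Sequence[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


sample = "# local secrets\nOPENAI_API_KEY=sk-placeholder\nHUGGINGFACE_API_KEY=\n"
values = parse_env(sample)
missing = missing_keys(values)
print(missing)  # ['HUGGINGFACE_API_KEY']
```

To check a real file, the same functions can be applied to its contents, for example `parse_env(open(".env").read())`.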

πŸ“ Contributing & License

We welcome contributions! See our Contributing Guidelines for details.

Licensed under the MIT License.


Built with ❤️ by the AdvancedRAG Team

Report Bug • Request Feature
