A state-of-the-art implementation of Retrieval Augmented Generation with advanced techniques
Features • Getting Started • Documentation • Contributing
AdvancedRAG implements advanced Retrieval Augmented Generation (RAG) techniques, with multi-strategy retrieval, automated evaluation, and a modular architecture.
AdvancedRAG/
├── AutoMergingRetrieval/
│   ├── utils.py                  # Core utilities
│   └── AutoMergingRetrieval.py
├── AdvancedRAGPipeline/
│   ├── src/
│   │   ├── utils.py              # Pipeline utilities
│   │   └── pipeline.py           # RAG orchestration
│   └── data/                     # Evaluation sets
└── data/                         # Shared resources
Auto-merging retrieval uses hierarchical node parsing to merge document nodes across levels of granularity, so retrieved passages carry more of their surrounding context.
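As a rough illustration of hierarchical node parsing, the sketch below splits a text into nested word chunks and records parent/child links. It is not the repository's implementation: `Node`, `build_hierarchy`, and the `chunk_sizes` values are hypothetical names and numbers chosen for the example, and chunking by word count is an assumption made for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    node_id: str
    text: str
    level: int                      # 0 = coarsest chunks
    parent_id: Optional[str] = None
    child_ids: List[str] = field(default_factory=list)


def build_hierarchy(text: str, chunk_sizes=(8, 4, 2)) -> Dict[str, Node]:
    """Split text into nested word chunks, coarsest first (hypothetical helper)."""
    nodes: Dict[str, Node] = {}

    def split(words, level, parent_id):
        size = chunk_sizes[level]
        for start in range(0, len(words), size):
            piece = words[start:start + size]
            # len(nodes) grows with every node, so ids stay unique
            node_id = f"L{level}-{len(nodes)}"
            nodes[node_id] = Node(node_id, " ".join(piece), level, parent_id)
            if parent_id is not None:
                nodes[parent_id].child_ids.append(node_id)
            # recurse into the next, finer level if there is one
            if level + 1 < len(chunk_sizes):
                split(piece, level + 1, node_id)

    split(text.split(), 0, None)
    return nodes
```

The finest-level nodes are the ones that would typically be embedded and searched; the parent links make it possible to swap several small hits for one larger, more contextual chunk.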
Window-based retrieval extracts text in overlapping windows to capture fine-grained context, improving retrieval precision.
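One way to picture overlapping windows is to pair each sentence with its neighbours, so the single sentence is what gets matched and the wider window is what gets passed on as context. The sketch below assumes sentence-level windows and a simple punctuation-based splitter; `sentence_windows` and `window_size` are hypothetical names, not taken from the repository.

```python
import re
from typing import List, Tuple


def sentence_windows(text: str, window_size: int = 1) -> List[Tuple[str, str]]:
    """Return (sentence, window) pairs; the window adds neighbours on each side."""
    # Naive splitter: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    pairs = []
    for i, sentence in enumerate(sentences):
        lo = max(0, i - window_size)
        hi = min(len(sentences), i + window_size + 1)
        pairs.append((sentence, " ".join(sentences[lo:hi])))
    return pairs
```

Adjacent windows share sentences, which is where the overlap comes from; a larger `window_size` trades precision for more context.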
Automated evaluation integrates feedback mechanisms that score answer relevance and groundedness, so low-quality responses can be detected.
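To make the two metrics concrete, here is a deliberately simple stand-in that scores them with token overlap. This is an assumption for illustration only: evaluation of this kind is usually done with an LLM or embedding-based judge, and the source does not say which scoring method the project uses. The names `overlap_score` and `evaluate` are hypothetical.

```python
import re
from typing import List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(candidate: str, reference: str) -> float:
    """Fraction of candidate tokens that also appear in the reference."""
    cand = _tokens(candidate)
    return len(cand & _tokens(reference)) / len(cand) if cand else 0.0


def evaluate(question: str, answer: str, contexts: List[str]) -> dict:
    return {
        # groundedness: share of answer tokens found in the retrieved text
        "groundedness": overlap_score(answer, " ".join(contexts)),
        # answer relevance: share of question tokens echoed by the answer
        "answer_relevance": overlap_score(question, answer),
    }
```

An answer that introduces terms absent from the retrieved contexts gets a low groundedness score under this proxy, which is the behaviour a real groundedness check is meant to capture.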
AutoMergingRetrieval
- Implements dynamic node size adjustment
- Uses similarity-based merging strategies
- Supports customizable merging thresholds
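The effect of a merging threshold can be sketched as follows. Note the substitution: this example merges on the fraction of a parent's children that were retrieved, a common auto-merging rule, rather than on a similarity score, because the repository's actual rule is not shown here. `auto_merge`, `children_of`, and `threshold` are hypothetical names.

```python
from typing import Dict, List, Set


def auto_merge(retrieved: Set[str],
               children_of: Dict[str, List[str]],
               threshold: float = 0.5) -> Set[str]:
    """Replace retrieved siblings with their parent when enough of them were hit."""
    merged = set(retrieved)
    changed = True
    while changed:                       # repeat so merges can cascade upward
        changed = False
        for parent, children in children_of.items():
            if parent in merged or not children:
                continue
            hits = [c for c in children if c in merged]
            # merge only when strictly more than `threshold` of the children were hit
            if len(hits) / len(children) > threshold:
                merged.difference_update(hits)
                merged.add(parent)
                changed = True
    return merged
```

With `children_of = {"A": ["a1", "a2", "a3"]}` and a threshold of 0.5, retrieving `a1` and `a2` returns the parent `A`, while retrieving only `a1` leaves the result unchanged. Raising the threshold makes merging rarer and keeps results more granular.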
Advanced RAG Pipeline
- Integrates multiple retrieval strategies
- Features automated evaluation loops
- Provides detailed performance metrics
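A minimal sketch of what an evaluation loop over several retrieval strategies might look like, assuming each strategy is a callable from a question to a list of context strings and that a scoring function is supplied by the caller. `compare_strategies`, `Retriever`, `Scorer`, and the metric names are hypothetical; the project's own pipeline interface is not described in this README.

```python
import time
from statistics import mean
from typing import Callable, Dict, List

Retriever = Callable[[str], List[str]]          # question -> retrieved contexts
Scorer = Callable[[str, List[str]], float]      # (question, contexts) -> score


def compare_strategies(strategies: Dict[str, Retriever],
                       questions: List[str],
                       scorer: Scorer) -> Dict[str, Dict[str, float]]:
    """Run every strategy on every question and average score and latency."""
    if not questions:
        raise ValueError("questions must not be empty")
    report: Dict[str, Dict[str, float]] = {}
    for name, retrieve in strategies.items():
        scores, latencies = [], []
        for question in questions:
            start = time.perf_counter()
            contexts = retrieve(question)
            latencies.append(time.perf_counter() - start)
            scores.append(scorer(question, contexts))
        report[name] = {
            "mean_score": mean(scores),
            "mean_latency_s": mean(latencies),
        }
    return report
```

Keeping the strategies behind one callable signature means a new retriever can be added to the comparison without touching the loop itself.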
- Python 3.8+
- OpenAI API key
- HuggingFace API key
- Clone the repository:
    git clone https://github.com/YanCotta/AdvancedRAG.git
    cd AdvancedRAG
- Install dependencies:
    pip install -r requirements.txt
- Configure environment: create .env in the project root:
    OPENAI_API_KEY=your_openai_api_key
    HUGGINGFACE_API_KEY=your_huggingface_api_key
- Run the pipelines:
    # For basic and auto-merging retrieval
    python src/run_retrieval.py
    # For full RAG pipeline with evaluations
    python AdvancedRAGPipeline/src/run_pipeline.py
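If a run fails on authentication, it can help to check that the .env file from the configuration step parses and contains both keys. The sketch below does this with the standard library only; it is not how the project itself loads its settings (that is not shown in this README), and `load_env_file` and `require_keys` are hypothetical helpers.

```python
from pathlib import Path
from typing import Dict

# The two key names come from the configuration step above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "HUGGINGFACE_API_KEY")


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # strip surrounding whitespace and optional quotes around the value
        values[key.strip()] = value.strip().strip("'\"")
    return values


def require_keys(values: Dict[str, str]) -> None:
    """Raise if any required key is absent or empty."""
    missing = [k for k in REQUIRED_KEYS if not values.get(k)]
    if missing:
        raise RuntimeError(f"Missing settings in .env: {', '.join(missing)}")
```

Calling `require_keys(load_env_file())` from the project root raises a `RuntimeError` naming any key that is missing or empty.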
We welcome contributions! See our Contributing Guidelines for details.
Licensed under the MIT License.
Built with ❤️ by the AdvancedRAG Team