janakiramm / Intel-Xeon-LLM-RAG-Inference-Setup

This repository provides a comprehensive guide to setting up and running an LLM inference server optimized for Intel Xeon machines, with a focus on Retrieval Augmented Generation (RAG). It includes step-by-step instructions for configuring a Docker-based server environment and a Python client setup.
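For orientation, below is a minimal sketch of the client side of the pattern the repository describes: a Python program sending a retrieval-augmented prompt to a locally running inference server. The port, endpoint path, and model name here are illustrative assumptions, not values taken from this repository; many Docker-based LLM servers expose an OpenAI-compatible HTTP API like the one assumed.

```python
# Minimal RAG client sketch. Assumptions (not from this repository):
# the server listens on localhost:8000 and exposes an OpenAI-compatible
# /v1/chat/completions endpoint; the model name is a placeholder.
import requests

SERVER_URL = "http://localhost:8000/v1/chat/completions"  # assumed endpoint


def ask(question: str, context: str) -> str:
    """Send a RAG-style prompt: retrieved context plus the user question."""
    payload = {
        "model": "meta-llama/Llama-2-7b-chat-hf",  # placeholder model name
        "messages": [
            {
                "role": "system",
                "content": f"Answer using only this context:\n{context}",
            },
            {"role": "user", "content": question},
        ],
    }
    response = requests.post(SERVER_URL, json=payload, timeout=60)
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # In a real RAG pipeline, `docs` would come from a retrieval step
    # (e.g. a vector-store similarity search) rather than a literal string.
    docs = "Intel Xeon processors include AMX instructions that accelerate inference."
    print(ask("What accelerates inference on Xeon?", docs))
```

The key design point is that retrieval happens before the request: the client injects the retrieved context into the prompt, so the server itself needs no knowledge of the document store.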
