Programming-from-A-to-Z / Example-RAG-Replicate

Retrieval Augmented Generation

Overview

This is an example Node.js application that uses embeddings and the LLaMA model for text retrieval and response generation. It processes a text corpus, generates an embedding for each "chunk" of text, and uses these embeddings to perform a "similarity search" in response to queries. The system consists of a Node.js server that handles API requests and a p5.js sketch for client interaction.
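
To make the "similarity search" idea concrete, here is a minimal sketch using cosine similarity over toy vectors. The function names (`cosineSimilarity`, `findSimilar`) and the `{ text, embedding }` entry shape are assumptions for illustration, not necessarily what this repo's code or embeddings.json uses.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Guard against zero-length vectors to avoid dividing by zero.
  return denom === 0 ? 0 : dot / denom;
}

// Return the k entries whose embeddings are most similar to the query.
// ASSUMPTION: entries look like [{ text, embedding }, ...].
function findSimilar(queryEmbedding, entries, k = 3) {
  return entries
    .map((entry) => ({
      text: entry.text,
      score: cosineSimilarity(queryEmbedding, entry.embedding),
    }))
    .sort((x, y) => y.score - x.score) // highest score first
    .slice(0, k);
}

// Toy 3-dimensional "embeddings"; real ones have hundreds of dimensions.
const entries = [
  { text: 'cats purr', embedding: [1, 0, 0] },
  { text: 'dogs bark', embedding: [0, 1, 0] },
  { text: 'kittens purr', embedding: [0.9, 0.1, 0] },
];
const top = findSimilar([1, 0, 0], entries, 2);
console.log(top.map((r) => r.text));
```

In a real run, the query embedding would come from the same embedding model that produced the chunk embeddings, so that the vectors are comparable.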

  • server.js: Server file that handles API requests and integrates with the Replicate API.
  • save-embeddings.js: Processes a text file and generates embeddings.
  • test-embeddings.js: Tests the embeddings search functionality without the client/server setup.
  • embeddings.json: Precomputed embeddings generated from the text corpus.
  • public/: p5.js sketch
  • .env: API token
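
The "augmented" part of retrieval augmented generation usually means placing the retrieved chunks into the prompt that is sent to the model. The sketch below shows one way that step might look. `buildPrompt` and the prompt wording are hypothetical and are not taken from server.js.

```javascript
// Combine retrieved chunks and the user's question into a single prompt.
// ASSUMPTION: retrievedChunks is an array of plain strings, best match first.
function buildPrompt(question, retrievedChunks) {
  // Number each chunk so the context is easy to read in the prompt.
  const context = retrievedChunks
    .map((chunk, i) => `[${i + 1}] ${chunk}`)
    .join('\n');
  return [
    'Use the context below to answer the question.',
    '',
    'Context:',
    context,
    '',
    `Question: ${question}`,
    'Answer:',
  ].join('\n');
}

const prompt = buildPrompt('What sound do cats make?', ['cats purr', 'kittens purr']);
console.log(prompt);
```

The resulting string would then be passed to the language model as its input. Keeping this step as a pure function makes it easy to inspect the exact prompt before any API call is made.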

How-To

  1. Install dependencies:
     npm install
  2. Set up the .env file with your Replicate API token:
     REPLICATE_API_TOKEN=your_api_token_here
  3. Generate the embeddings.json file by running save-embeddings.js. (You'll need to hard-code a text filename and adjust how the text is split up depending on the format of your data.)
     const raw = fs.readFileSync('text-corpus.txt', 'utf-8');
     let chunks = raw.split(/\n+/);

     node save-embeddings.js
  4. Run the server:
     node server.js

Open your browser to http://localhost:3000 (or whichever port is specified).
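
The embeddings step above notes that how the text is split depends on your data. As one possible variation, the sketch below splits on blank lines (paragraphs) instead of on every newline, collapses internal whitespace, and drops empty chunks. `splitIntoChunks` and its `minLength` parameter are hypothetical names used only for this example.

```javascript
// Split raw text into paragraph-sized chunks.
// ASSUMPTION: paragraphs in the corpus are separated by one or more blank lines.
function splitIntoChunks(raw, minLength = 1) {
  return raw
    .split(/\n\s*\n/) // break on blank lines
    .map((chunk) => chunk.replace(/\s+/g, ' ').trim()) // collapse whitespace
    .filter((chunk) => chunk.length >= minLength); // drop empty or tiny chunks
}

const sample = 'First paragraph,\nstill first.\n\nSecond paragraph.\n\n\n';
const chunks = splitIntoChunks(sample);
console.log(chunks);
// → [ 'First paragraph, still first.', 'Second paragraph.' ]
```

Raising `minLength` is a simple way to skip headings or stray fragments that would otherwise get their own embedding.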
