RAD-Ninjas / llm-on-prem


LLM-On-Prem


Introduction

The frontend is a boilerplate app generated by `npx create-secure-chatgpt-app`. It leverages several of Pangea's security services to secure ChatGPT usage.

Features

  • Frontend:

    • Authentication using Pangea's AuthN service and the Next.js framework
    • Secure chat page
    • Secure `chat/generate` API endpoint
    • Redacting user prompts using Pangea's Redact service
    • Auditing user prompts using Pangea's Secure Audit Log service
    • De-fanging malicious responses from the OpenAI API using Pangea's Domain Intel and URL Intel services
  • Backend:

    • lm-sys FastChat server configured to operate as a stand-in for the OpenAI API
    • vLLM inference engine worker
    • Support for multiple Hugging Face Transformers models
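Because FastChat exposes an OpenAI-compatible API, the frontend (or any client) can target the local backend simply by overriding the base URL. A minimal sketch, assuming the backend listens on `localhost:8000` and the worker has loaded a Vicuna model (the actual host, port, and model name are assumptions; check your compose file and `.env`):

```shell
# Assumed endpoint of the local FastChat stand-in for the OpenAI API;
# the real host/port come from your docker compose configuration.
BASE_URL="http://localhost:8000/v1"

# Hypothetical model name -- use whichever HF model your worker loaded.
PAYLOAD='{"model": "vicuna-7b-v1.5", "messages": [{"role": "user", "content": "Hello"}]}'

# Once the containers are up, send a chat completion request:
# curl -s "$BASE_URL/chat/completions" \
#   -H "Content-Type: application/json" \
#   -d "$PAYLOAD"
```

The same base-URL override works for any OpenAI client library, which is what lets the frontend talk to the on-prem backend without code changes.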

Installation

Once you've cloned the repo, perform the following steps to get the app up and running:

  1. Set up your local environment file:

    cp .env.example .env
    
  2. Update the relevant keys and variables with your own values


  • Note: If you are running Docker Desktop on Windows or macOS, you will likely need to increase your Docker memory allocation to at least 16 GB and max out your CPU allocation

  3. Build and run the containers:

    • CPU:
      • Build: `docker compose --profile cpu build`
      • Run: `docker compose --profile cpu up`
    • CUDA:
      • Build: `docker compose --profile cuda build`
      • Run: `docker compose --profile cuda up`
    • OpenVINO:
      • Build: `docker compose --profile openvino build`
      • Run: `docker compose --profile openvino up`
    • Metal:
      • Build: `docker compose --profile metal build`
      • Run: `docker compose --profile metal up`
    • GGML:
      • Build: `docker compose --profile ggml build`
      • Run: `docker compose --profile ggml up`
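Which profile to build depends on the host hardware. A small helper sketch (hypothetical, not part of the repo) that picks a profile automatically, assuming `nvidia-smi` indicates a CUDA-capable host and Apple hardware should use the Metal profile:

```shell
# Hypothetical helper: choose a docker compose profile for this host.
# Falls back to the cpu profile when no accelerator is detected.
if command -v nvidia-smi >/dev/null 2>&1; then
  profile=cuda
elif [ "$(uname -s)" = "Darwin" ]; then
  profile=metal
else
  profile=cpu
fi

# Commands are echoed rather than executed so you can review them first.
echo "docker compose --profile $profile build"
echo "docker compose --profile $profile up"
```

Pipe the output to `sh` (or copy the printed commands) once you are happy with the detected profile; OpenVINO and GGML hosts would still need to select their profile manually.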
