[WIP] OpenAI.mini
This repo implements the OpenAI APIs with open source models, for example, open source LLMs for chat, Whisper for audio, SDXL for images, and so on. With this repo, you can interact with LLMs using the openai libraries or the LangChain library.
Frontend
How to use
1. Install dependencies
pip3 install -r requirements.txt
pip3 install -r app/requirements.txt
If you have make, you can also run:
make install
2. Get frontend
You may build the frontend with yarn yourself, or just download the built package from the release.
Option 1: install the dependencies and build frontend
cd app/frontend
yarn install
yarn build
Option 2: download it from release
- Download the dist.tar.gz from the release page.
- Extract the dist directory and put it in the app/frontend directory.
Please make sure that the directory layout looks like this:
┌─hjmao in ~/workspace/openai.mini/app/frontend/dist
└─± ls
asset-manifest.json assets favicon.ico index.html manifest.json robots.txt static
3. Set the environment variables and modify them as needed
cp .env.example .env
# Modify the `.env` file
4. Download the model weights manually (optional)
If you have already downloaded the weight files, or you want to manage the model weights in a specific location, you can set MODEL_HUB_PATH in the .env file and put the weight files there. MODEL_HUB_PATH is set to hub by default.
OpenAI.mini first looks for the model weights in MODEL_HUB_PATH. If they are not there, it automatically downloads the weight files from Hugging Face by model name. The MODEL_HUB_PATH directory looks like this:
MODEL_HUB_PATH directory layout
┌─hjmao at 573761 in ~/workspace/openai.mini/hub
└─○ tree -L 2
.
├── baichuan-inc
│   ├── Baichuan-13B-Base
│   ├── Baichuan-13B-Chat
│   └── Baichuan2-13B-Chat
├── intfloat
│   ├── e5-large-v2
│   └── multilingual-e5-large
├── meta-llama
│   ├── Llama-2-13b-chat-hf
│   └── Llama-2-7b-chat-hf
├── openai
│   ├── base.pt
│   ├── large-v2.pt
│   ├── medium.en.pt
│   ├── medium.pt
│   ├── small.pt
│   └── tiny.pt
├── Qwen
│   └── Qwen-7B-Chat
├── stabilityai
│   ├── FreeWilly2
│   ├── stable-diffusion-xl-base-0.9
│   └── stable-diffusion-xl-base-1.0
├── thenlper
│   └── gte-large
└── THUDM
    ├── chatglm2-6b
    ├── chatglm3-6b
    └── codegeex2-6b
Note: the models can be loaded on startup or on the fly.
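The lookup order described in this step (local MODEL_HUB_PATH first, then download by model name) can be sketched roughly as follows. This is only an illustration of the idea, not the repo's actual code: the helper name resolve_model_source is hypothetical, and it assumes the weights live under MODEL_HUB_PATH/<organization>/<model>, as in the layout shown above.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_model_source(model_name: str, hub_path: Optional[str] = None) -> str:
    """Return a local weights directory if present, else the model name itself.

    Hypothetical helper: its name and behavior are assumptions for illustration.
    """
    # Fall back to the documented default ("hub") when MODEL_HUB_PATH is unset.
    hub = Path(hub_path or os.environ.get("MODEL_HUB_PATH", "hub"))
    local_dir = hub / model_name
    if local_dir.is_dir():
        return str(local_dir)
    # Not found locally: hand the name back so a Hugging Face loader can download it.
    return model_name
```

For example, resolve_model_source("THUDM/chatglm3-6b") would return the local directory when it exists under the hub path, and the plain model name otherwise.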
5. Start server with OpenAI.mini
python3 -m src.api
python3 -m app.server
6. Access the OpenAI.mini services
OpenAI.mini implements most APIs of the OpenAI platform and also provides a ChatGPT-like web frontend. You can access the OpenAI.mini services with the openai libraries or chat with the models in the web frontend.
- Access as an OpenAI service: use the openai package or the LangChain library, setting openai.api_base="http://YOUR_OWN_IP:8000/api/v1" and openai.api_key="none_or_any_other_string". Find more detailed examples here.
- Access as a ChatGPT-like app: open http://YOUR_OWN_IP:8001/index.html?model=MODEL_NAME in your web browser.
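To make the two access paths concrete, here is a minimal sketch using only the Python standard library. The helper names build_chat_request and frontend_url are hypothetical. The /chat/completions path is an assumption based on the route the openai client appends to api_base, and the ports follow the URLs given above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_chat_request(host: str, model: str, prompt: str, api_key: str = "none") -> Request:
    """Build (but do not send) a chat completion request for the API server."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return Request(
        url=f"http://{host}:8000/api/v1/chat/completions",  # assumed route
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"},
        method="POST",
    )


def frontend_url(host: str, model: str) -> str:
    """Build the web frontend URL for a given model name."""
    return f"http://{host}:8001/index.html?{urlencode({'model': model})}"
```

To actually send the request, pass it to urllib.request.urlopen while the server is running and parse the JSON body of the response.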
OpenAI API Status
Services | API | Status | Description |
---|---|---|---|
Authorization | | | |
Models | List models | ✅ Done | |
Models | Retrieve model | ✅ Done | |
Chat | Create chat completion | Partially done | Supports multiple LLMs |
Completions | Create completion | | |
Images | Create image | ✅ Done | |
Images | Create image edit | | |
Images | Create image variation | | |
Embeddings | Create embeddings | ✅ Done | Supports multiple LLMs |
Audio | Create transcription | ✅ Done | |
Audio | Create translation | ✅ Done | |
Files | List files | ✅ Done | |
Files | Upload file | ✅ Done | |
Files | Delete file | ✅ Done | |
Files | Retrieve file | ✅ Done | |
Files | Retrieve file content | ✅ Done | |
Fine-tuning | Create fine-tuning job | | |
Fine-tuning | Retrieve fine-tuning job | | |
Fine-tuning | Cancel fine-tuning | | |
Fine-tuning | List fine-tuning events | | |
Moderations | Create moderation | | |
Edits | Create edit | | |
Supported Language Models
Supported Embedding Models
Model | Embedding Dim. | Sequence Length | Checkpoint link |
---|---|---|---|
bge-large-zh | 1024 | ? | BAAI/bge-large-zh |
m3e-large | 1024 | ? | moka-ai/m3e-large |
text2vec-large-chinese | 1024 | ? | GanymedeNil/text2vec-large-chinese |
gte-large | 1024 | 512 | thenlper/gte-large |
e5-large-v2 | 1024 | 512 | intfloat/e5-large-v2 |
Supported Diffusion Models
Model | Response Format | Checkpoint link |
---|---|---|
stable-diffusion-xl-base-1.0 | b64_json, url | stabilityai/stable-diffusion-xl-base-1.0 |
stable-diffusion-xl-base-0.9 | b64_json, url | stabilityai/stable-diffusion-xl-base-0.9 |
Supported Audio Models
Model | #Params | Checkpoint link |
---|---|---|
whisper-1 | 1550 M | alias for whisper-large-v2 |
whisper-large-v2 | 1550 M | large-v2 |
whisper-medium | 769 M | medium |
whisper-small | 244 M | small |
whisper-base | 74 M | base |
whisper-tiny | 39 M | tiny |
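The table above can be read as a lookup from API model name to Whisper checkpoint, with whisper-1 acting as an alias. A minimal sketch of that mapping follows. The dictionary and function names are hypothetical, and how the server actually resolves names is an assumption.

```python
# Mapping taken from the table above: API model name -> Whisper checkpoint name.
WHISPER_CHECKPOINTS = {
    "whisper-large-v2": "large-v2",
    "whisper-medium": "medium",
    "whisper-small": "small",
    "whisper-base": "base",
    "whisper-tiny": "tiny",
}

# "whisper-1" is listed as an alias for whisper-large-v2.
WHISPER_ALIASES = {"whisper-1": "whisper-large-v2"}


def resolve_whisper_checkpoint(model: str) -> str:
    """Resolve an API model name (or alias) to a checkpoint name."""
    name = WHISPER_ALIASES.get(model, model)
    if name not in WHISPER_CHECKPOINTS:
        raise ValueError(f"Unsupported audio model: {model}")
    return WHISPER_CHECKPOINTS[name]
```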
Example Code
Stream Chat
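# Uses the pre-1.0 openai Python package interface (openai.ChatCompletion)
import openai
openai.api_base = "http://localhost:8000/api/v1"
openai.api_key = "none"
# Print the reply piece by piece as the server streams it
for chunk in openai.ChatCompletion.create(
    model="Baichuan2-13B-Chat",
    messages=[{"role": "user", "content": "Which mountain is the second highest one in the world?"}],
    stream=True
):
    # Some chunks carry no content, so check before printing
    if hasattr(chunk.choices[0].delta, "content"):
        print(chunk.choices[0].delta.content, end="", flush=True)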
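If you want the full reply as one string rather than printing it as it arrives, the streamed pieces can be joined. The helper below is a small sketch with a hypothetical name, collect_stream. It assumes each chunk is dict-like with a choices[0].delta entry, which is how the pre-1.0 openai package exposes streamed responses.

```python
from typing import Any, Iterable, Mapping


def collect_stream(chunks: Iterable[Mapping[str, Any]]) -> str:
    """Concatenate the content pieces of a streamed chat completion."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        # Chunks that only carry a role, or nothing at all, contribute no text.
        parts.append(delta.get("content") or "")
    return "".join(parts)
```

Passing the iterator returned by openai.ChatCompletion.create(..., stream=True) to collect_stream should then yield the whole answer at once.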
Chat
import openai
openai.api_base = "http://localhost:8000/api/v1"
openai.api_key = "none"
# Request the complete reply in a single response
resp = openai.ChatCompletion.create(
    model="Baichuan2-13B-Chat",
    messages=[{"role": "user", "content": "Which mountain is the second highest one in the world?"}]
)
print(resp.choices[0].message.content)
Create Embeddings
import openai
openai.api_base = "http://localhost:8000/api/v1"
openai.api_key = "none"
embeddings = openai.Embedding.create(
    model="gte-large",
    input="The food was delicious and the waiter..."
)
print(embeddings)
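A common next step is to compare two embedding vectors. The sketch below computes cosine similarity on plain Python lists. The function name is hypothetical, and it assumes the vectors are read from the response as embeddings["data"][0]["embedding"], following the OpenAI response format.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```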
List LLM Models
import openai
openai.api_base = "http://localhost:8000/api/v1"
openai.api_key = "none"
# Print the models the server currently exposes
print(openai.Model.list())
Create Image
import openai
from base64 import b64decode
from IPython.display import Image
openai.api_base = "http://localhost:8000/api/v1"
openai.api_key = "none"
# Ask for the image as base64-encoded JSON rather than a URL
response = openai.Image.create(
    prompt="An astronaut riding a green horse",
    n=1,
    size="1024x1024",
    response_format='b64_json'
)
b64_json = response['data'][0]['b64_json']
image = b64decode(b64_json)
# Display the image in the notebook
Image(image)
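Outside a notebook, the decoded bytes can be written to disk instead of displayed. This is a minimal sketch: the helper name save_b64_image is hypothetical, and the image format (and so the right file extension) is an assumption, since it depends on what the server returns.

```python
from base64 import b64decode
from pathlib import Path


def save_b64_image(b64_json: str, path: str) -> int:
    """Decode a base64 image payload, write it to path, and return the byte count."""
    data = b64decode(b64_json)
    Path(path).write_bytes(data)
    return len(data)
```

For example, save_b64_image(response['data'][0]['b64_json'], "astronaut.png") would store the image generated above, assuming the payload is a PNG.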
Create Transcription
# Cell 1: set openai
import openai
openai.api_base = "http://localhost:8000/api/v1"
openai.api_key = "None"
# Cell 2: create a recorder in notebook
# ===================================================
# sudo apt install ffmpeg
# pip install torchaudio ipywebrtc notebook
# jupyter nbextension enable --py widgetsnbextension
from IPython.display import Audio
from ipywebrtc import AudioRecorder, CameraStream
camera = CameraStream(constraints={'audio': True,'video':False})
recorder = AudioRecorder(stream=camera)
recorder
# Cell 3: transcribe
temp_file = '/tmp/recording.webm'
# Save the recorded audio to a temporary file
with open(temp_file, 'wb') as f:
    f.write(recorder.audio.value)
# Send the recording to the transcription endpoint
with open(temp_file, "rb") as audio_file:
    transcript = openai.Audio.transcribe("whisper-1", audio_file)
print(transcript.text)
Acknowledgement
This project draws on code from many excellent developers, such as api-for-open-llm by @xusenlinzy and LLaMA-Efficient-Tuning by @hiyouga. Many thanks to them.