Inferential

Inferential is a web application that provides a playground and API for running large language models.

QuickStart

Use the following to clone the git repository, install dependencies, and run the application:

git clone git@github.com:jncraton/inferential.git
cd inferential
pip install -r requirements.txt
make run

Here's a Python example communicating with the API:

>>> import requests
>>> requests.get("http://localhost:5000/api?input=Where is Paris").text
'Paris is located in France.'

Production Deployment

While using the built in server is useful for development, it is not ideal for a production environment. You will need to deploy the application to a server that is safer for production use. One way you can deploy is using the gunicorn package. Gunicorn only works on Unix based machines so if you use this method be sure to use a Unix machine. To start the process, clone the git repository onto your Unix machine.

Clone the repository

git clone git@github.com:jncraton/inferential.git

Then you will need to create a new virtual environment. Navigate to the project folder and enter the following command.

python3 -m venv venv

Then activate the environment

source venv/bin/activate

Once you have activated your environment, you need to install all the requirements if you haven't already installed them.

pip install -r requirements.txt
pip install gunicorn

Then all that is left is to start the server using gunicorn using the following command

gunicorn -w 4 -b 0.0.0.0:8000 "inferential:create_app()"

Once you have run that command you should be able to connect to your server on 0.0.0.0:8000 For further help, refer to the Gunicorn documentation. (https://gunicorn.org/)

About

An open inference server for educational use

MIT License

Languages

Language:Python 65.0%Language:HTML 13.0%Language:JavaScript 10.8%Language:CSS 10.2%Language:Makefile 1.0%