mit-han-lab / TinyChatEngine

TinyChatEngine: On-Device LLM Inference Library

Home Page: https://mit-han-lab.github.io/TinyChatEngine/

containerized as a Dockerfile

bhpayne opened this issue

If there's interest in having a containerized version, here's a Dockerfile (or, for Podman, a Containerfile) that works for me:

# https://hub.docker.com/r/phusion/baseimage/tags
FROM phusion/baseimage:18.04-1.0.0

RUN apt-get update && \
    apt-get install -y \
               python3 \
               python3-dev \
               python3-pip \
               git

WORKDIR /opt
#RUN git clone --recursive https://github.com/mit-han-lab/TinyChatEngine
# or, if you already cloned it locally,
RUN mkdir TinyChatEngine
WORKDIR /opt/TinyChatEngine
COPY TinyChatEngine .
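# NOTE: COPY expects a local TinyChatEngine checkout in the build context,
# next to this Dockerfile (see the download_repo target in the Makefile below)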

RUN python3 -m pip install -r requirements.txt

WORKDIR /opt/TinyChatEngine/llm

# the following sed commands (commented out by default) disable the deletion of the downloaded model .zip file
#RUN sed -i -e '199 i \ \ \ \ \ \ \ \ pass' tools/download_model.py
#RUN sed -i -e '200,201d' tools/download_model.py
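# NOTE: the hard-coded line numbers above target a specific revision of
# tools/download_model.py and may need adjusting if the script changes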

# the list of available models is on https://github.com/mit-han-lab/TinyChatEngine/tree/main?tab=readme-ov-file#download-and-deploy-models-from-our-model-zoo
#RUN python3 tools/download_model.py --model CodeLLaMA_13B_Instruct_awq_int4 --QM QM_x86
RUN python3 tools/download_model.py --model LLaMA2_7B_chat_awq_int4 --QM QM_x86

RUN make chat -j

# ./chat
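Once the image is built, the chat binary sits in the image's working directory (/opt/TinyChatEngine/llm), so a session can also be started in a single step; a minimal sketch, reusing the image tag from the Makefile below:

docker run -it --rm tinychatengine-llama2_7b_chat_awq_int4 ./chat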

and here's a Makefile that wraps the build and run steps:

mytag=tinychatengine-llama2_7b_chat_awq_int4
#mytag=tinychatengine-codellama_13b_instruct_awq_int4

.PHONY: docker docker_build docker_run download_repo
docker: docker_build docker_run
docker_build:
	docker build -t $(mytag) .

docker_run:
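# the -v bind mount exposes the current directory at /scratch inside the container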
	docker run -it --rm -v `pwd`:/scratch $(mytag) /bin/bash

download_repo:
	git clone --recursive https://github.com/mit-han-lab/TinyChatEngine
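For Podman, this same file works as a Containerfile, and the CLI is flag-compatible for these commands; a minimal sketch, swapping docker for podman:

podman build -t tinychatengine-llama2_7b_chat_awq_int4 .
podman run -it --rm -v "$(pwd)":/scratch tinychatengine-llama2_7b_chat_awq_int4 /bin/bash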