- Download an AWQ model from HuggingFace and place it in /models
- Update the `<model_name>` in predict.py to match the file in /models
- Create a model on Replicate (https://replicate.com/docs/guides/push-a-transformers-model)
- Run `cog login`
- Run `cog push r8.im/<your-username>/<your-model-name>`
- Run `docker system prune` to clean up temp images
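
The login, push, and cleanup steps above can be scripted. This is a minimal sketch, assuming `cog` and `docker` are installed and on your PATH. The names `REPLICATE_USER`, `MODEL_NAME`, `DRY_RUN`, `push_target`, `run`, and `deploy` are hypothetical and introduced only for illustration; they are not part of Cog or Replicate.

```shell
#!/bin/sh
set -eu

# Placeholders (assumptions): set these to your Replicate username and model name.
REPLICATE_USER="${REPLICATE_USER:-your-username}"
MODEL_NAME="${MODEL_NAME:-your-model-name}"
# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Build the image name passed to cog push: r8.im/<username>/<model-name>.
push_target() {
  printf 'r8.im/%s/%s\n' "$1" "$2"
}

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Log in, push the model, then clean up temporary Docker images.
deploy() {
  run cog login
  run cog push "$(push_target "$REPLICATE_USER" "$MODEL_NAME")"
  run docker system prune
}

deploy
```

Run it with `DRY_RUN=0` once the printed commands look right. Note that `docker system prune` asks for confirmation and removes all stopped containers, unused networks, dangling images, and unused build cache, not only the images Cog created.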
Thanks to nateraw on https://github.com/nateraw/replicate-examples