karpathy / llama2.c

Inference Llama 2 in one file of pure C

How to quantize stories15M.bin

forcekeng opened this issue

Hi, I want to know how to quantize stories15M.bin or stories42M.bin. I tried running python export.py, but it complains that there is no params.json.

If you have a .pt checkpoint file, you can export a quantized model from it:

$ python export.py stories15M_q8.bin --version 2 --checkpoint out/ckpt.pt
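
For intuition about what the quantized export does: as far as I understand, `--version 2` writes int8 weights in small groups, each with its own float scale (the "q8" in the filename above). The snippet below is only a rough pure-Python sketch of that idea, not the code in export.py. The function names, the list-based layout, and the tiny `group_size` default are my own assumptions for illustration.

```python
# Illustrative sketch of symmetric int8 group quantization.
# NOT the export.py implementation; names and defaults are hypothetical.

def quantize_q80(weights, group_size=4):
    """Quantize a flat list of floats to int8 values, one scale per group."""
    assert len(weights) % group_size == 0, "length must be a multiple of group_size"
    quantized, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        wmax = max(abs(w) for w in group)
        # Map the largest magnitude in the group to 127; avoid dividing by zero.
        scale = wmax / 127.0 if wmax > 0 else 1.0
        scales.append(scale)
        quantized.extend(int(round(w / scale)) for w in group)
    return quantized, scales


def dequantize_q80(quantized, scales, group_size=4):
    """Recover approximate floats by multiplying each int8 value by its group scale."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]
```

Calling `quantize_q80` on a list of weights and then `dequantize_q80` on the result gives back values that differ from the originals by at most half a scale step per group, which is why a smaller group size generally means less quantization error at the cost of storing more scales.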