- Check if we can pretrain from scratch with 1.58 Bit (random initialized) (we are here)
- Initialize 1.58 Bit from Mixtral/Mistral weights (we are here)
- Continued pretraining
- Move to ASIC
- AGI (in 1.58 bit, on ASIC)
```bash
python3 -m venv venv
. ./venv/bin/activate
cd hqq && pip install -e .
```
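The 1.58-bit format stores each weight as one of the three values {-1, 0, +1} (log2(3) ≈ 1.58 bits) plus one scale per group of weights. A minimal pure-Python sketch of groupwise ternary quantization follows — this is an illustration of the idea only, not the hqq implementation; the mean-absolute-value scale and 0.5 threshold are common heuristics, not hqq's exact choices:

```python
# Sketch of groupwise ternary (1.58-bit) quantization.
# Assumption: scale = mean |w| per group, threshold = 0.5 * scale
# (a common ternary heuristic, not necessarily what hqq uses).

def quantize_ternary(weights, group_size=16):
    """Quantize a flat list of floats to {-1, 0, +1}, one scale per group."""
    groups = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scale = sum(abs(w) for w in group) / len(group)
        threshold = 0.5 * scale
        q = [-1 if w < -threshold else (1 if w > threshold else 0)
             for w in group]
        groups.append((scale, q))
    return groups

def dequantize_ternary(groups):
    """Reconstruct approximate float weights from (scale, ternary) groups."""
    out = []
    for scale, q in groups:
        out.extend(v * scale for v in q)
    return out
```

A smaller groupsize means more scales are stored (higher memory cost) but each scale fits its group better — consistent with the perplexity gap between groupsize 16 and 8 in the table below.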
| Model | Dataset | Quant | Groupsize | PPL |
|---|---|---|---|---|
| TheBloke/Llama-2-7B-fp16 | wikitext (wikitext-2-raw-v1), validation split | HQQ 1.58 | 16 | 400.46 |
| TheBloke/Llama-2-7B-fp16 | wikitext (wikitext-2-raw-v1), validation split | HQQ 1.58 | 8 | 8.69 |
| TheBloke/Llama-2-7B-fp16 | wikitext (wikitext-2-raw-v1), validation split | FP16 | - | 5.18 |
| TheBloke/Llama-2-13B-fp16 | wikitext (wikitext-2-raw-v1), validation split | HQQ 1.58 | 16 | 48.23 |
| TheBloke/Llama-2-13B-fp16 | wikitext (wikitext-2-raw-v1), validation split | HQQ 1.58 | 8 | 7.2732 |
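The PPL column is perplexity: the exponential of the mean per-token negative log-likelihood over the validation split, so lower is better and FP16 is the floor the quantized models are measured against. The metric itself reduces to a few lines (a generic sketch, independent of any particular evaluation harness):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood per token).

    token_logprobs: natural-log probabilities the model assigned
    to each ground-truth token in the evaluation text.
    """
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)
```

For intuition: a model that assigns every token probability 1/5 has perplexity exactly 5 — it is "as confused as" a uniform choice among 5 tokens. By that reading, groupsize 8 (PPL 8.69 on 7B) stays close to FP16 (5.18), while groupsize 16 (PPL 400.46) is effectively broken.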