# lm-evaluation-harness-webui-wrapper
Have you ever pondered how quantization affects model performance, or what the trade-offs are between quantization methods? Me too. Let's explore together!

This wrapper leverages both the oobabooga text-generation-webui and the EleutherAI lm-evaluation-harness to evaluate GPTQ-quantized models on benchmarks including ARC, HellaSwag, MMLU, and TruthfulQA, akin to the Open LLM Leaderboard.
## Test Results
Comparison of the original model against its quantized versions. Benchmark: ARC, 25-shot, `arc_challenge` (acc_norm), matching the EleutherAI lm-evaluation-harness.
| Model Name | Bits | Group Size (GS) | ARC (acc_norm) | Act Order |
|---|---|---|---|---|
| OpenOrca Platypus2 13B | 16-bit | NA | 62.88% | NA |
| OpenOrca Platypus2 13B | 8-bit | None | 62.88% | Yes |
| OpenOrca Platypus2 13B | 4-bit | 32 | 62.28% | Yes |
| OpenOrca Platypus2 13B | 4-bit | 128 | 62.62% | No |
Note: the 4-bit, GS 32 model reports a lower acc_norm than the 4-bit, GS 128 model, but a higher acc (58.02% vs. 57.59%).
## Supported OS
Linux & Windows
## Linux Instructions
- Follow the installation procedure for [oobabooga text-generation-webui](https://github.com/oobabooga/text-generation-webui).
- Download a model and run it in the webui to ensure it is working. Then, shut down the webui.
- Update `activate_env.sh` with the path to the webui (a sketch of what this file might contain follows below).
- Open a terminal in this directory and run:

```
chmod +x activate_env.sh
source activate_env.sh
```
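For reference, `activate_env.sh` is expected to do little more than point at your webui checkout and activate its Python environment. The directory name, variable name, and venv layout below are assumptions; match them to your actual install:

```sh
#!/usr/bin/env bash
# Assumed location of the text-generation-webui checkout -- edit for your system.
WEBUI_DIR="$HOME/text-generation-webui"

# Activate the webui's Python environment so the harness runs against the
# same interpreter and packages. The venv path is an assumption; one-click
# installs may keep the environment elsewhere (e.g. installer_files/env).
source "$WEBUI_DIR/venv/bin/activate"
```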
- Proceed with the installation of lm-evaluation-harness:

```
git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .
```
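Optionally, you can sanity-check the harness on its own before wiring it into the wrapper by invoking its `main.py` entry point directly; the model path below is a placeholder:

```sh
# Direct 25-shot ARC run against a local model (path is a placeholder).
python main.py \
    --model hf-causal \
    --model_args pretrained=/path/to/your/model \
    --tasks arc_challenge \
    --num_fewshot 25
```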
- Update the model path in `model.json` (a hypothetical example follows below), then run:

```
chmod +x run_script.sh
./run_script.sh
```
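The exact schema of `model.json` is defined by this repo's scripts; the key name in the sketch below is hypothetical, and the path is a placeholder for your own model directory:

```jsonc
// Hypothetical sketch -- check model.json in this repo for the real key names.
{
  "model_path": "/path/to/your/gptq-model"
}
```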
## Windows Instructions
- Follow the installation procedure for [oobabooga text-generation-webui](https://github.com/oobabooga/text-generation-webui).
- Download a model and run it in the webui to ensure it is working. Then, shut down the webui.
- Update `activate_env.bat` with the path to the webui (a sketch of what this file might contain follows below).
- Run `activate_env.bat`.
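As on Linux, `activate_env.bat` only needs to point at your webui install and activate its Python environment. The paths below are assumptions; adjust them to your setup:

```bat
@echo off
:: Assumed location of the text-generation-webui checkout -- edit for your system.
set WEBUI_DIR=C:\text-generation-webui

:: Activate the webui's Python environment. The venv layout is an assumption;
:: one-click installs may keep the environment elsewhere.
call %WEBUI_DIR%\venv\Scripts\activate.bat
```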
- Proceed with the installation of the lm-evaluation-harness big-refactor branch:

```
git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
git checkout big-refactor
pip install -e .
```
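Optionally, you can sanity-check the refactored harness directly. Its CLI changed during the refactor, so the entry point and flags in this sketch (the `lm_eval` console script with the `hf` model type) are assumptions that may need adjusting to your checkout; the model path is a placeholder:

```
lm_eval --model hf --model_args pretrained=C:\path\to\your\model --tasks arc_challenge --num_fewshot 25
```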
- Update the model path in `model.json` (see the hypothetical example in the Linux section), then run:

```
./run_script.bat
```
## TODO
- Make the process more user-friendly.
- Add a file for defining variables.
- Add results for 3-bit models.