Harness does not work properly
RobinJing opened this issue · comments
Describe the bug
I want to use Harness for evaluation, but it does not work.
How to reproduce
Steps to reproduce the error:
- Using Conda:
- conda create an environment with Python 3.11
- conda activate the environment, then git clone the harness repo
- pip install -e .
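The conda steps above can be sketched as a script. The environment name and the clone URL are assumptions (substitute the harness fork you are actually testing):

```shell
# Create and activate a Python 3.11 environment (name "harness" is assumed)
conda create -n harness python=3.11 -y
conda activate harness

# Clone the harness repo and install it in editable mode
# (repo URL is a placeholder -- use the harness repo you are testing)
git clone https://github.com/EleutherAI/lm-evaluation-harness.git
cd lm-evaluation-harness
pip install -e .
```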

- Using the Docker b11 image:
- pip install -e . (completes successfully)
- python run_multi_llb.py --model ipex-llm --pretrained /model/DeepSeek-R1-Distill-Qwen-32B --precision sym_int4 --device xpu:0,1,2,3 --tasks mmlu --batch 1 --no_cache
- pip install datasets==2.21.0
- python run_multi_llb.py --model ipex-llm --pretrained /model/DeepSeek-R1-Distill-Qwen-32B --precision sym_int4 --device xpu:0,1,2,3 --tasks mmlu --batch 1 --no_cache
- pip install accelerate==0.26.0, then run run_multi_llb.py again
- pip install trl==0.11.0, then run run_multi_llb.py again
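Since several version pins are tried in sequence above, it can help to confirm which versions are actually installed before rerunning. A small check (pin list taken from the steps above; the loop itself is illustrative):

```shell
# Required pins from the reproduction steps above
required="datasets==2.21.0 accelerate==0.26.0 trl==0.11.0"

# Compare each pin against what pip reports for the active environment
for pin in $required; do
  name=${pin%%==*}   # package name before "=="
  want=${pin##*==}   # pinned version after "=="
  have=$(pip show "$name" 2>/dev/null | awk '/^Version:/{print $2}')
  if [ "$have" = "$want" ]; then
    echo "$name OK ($have)"
  else
    echo "$name MISMATCH: want $want, have ${have:-none}"
  fi
done
```

Any MISMATCH line means the pin did not take effect in the environment the script runs in (e.g. a different conda env is active).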
For the conda environment, I have set up the environment and the deployment is ready to go. After this, I can start the script with:
python run_multi_llb.py --model ipex-llm --pretrained /model/DeepSeek-R1-Distill-Qwen-32B --precision sym_int4 --device xpu:0,1,2,3 --tasks hellaswag --batch 1 --no_cache
but the program stops suddenly without any meaningful output:
You could try running it again with the latest b16 image. There is no need to install conda inside the Docker container; you can use pip install directly to test.
If you want to run one single large model across multiple cards, refer to this guide: https://github.com/intel/ipex-llm/tree/main/python/llm/dev/benchmark/ceval#multi-gpu-environment
run_multi_llb.py is intended to run multiple tasks across multiple cards, with one model and one task per card.
Hi, I used the b16 Docker image to perform the test. I simply start a vLLM backend server and use the harness 'local-completions' model as the frontend. When I run the winogrande task, it works fine; however, if I run the mmlu task, the vLLM server crashes:
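For reference, a minimal version of that setup might look like the following. The port, model name, and the exact server entrypoint are assumptions; the ipex-llm vLLM fork may use a different launch command, so adjust to your environment:

```shell
# Start an OpenAI-compatible vLLM server in the background
# (entrypoint and port are assumptions; the ipex-llm fork may differ)
python -m vllm.entrypoints.openai.api_server \
  --model /model/DeepSeek-R1-Distill-Qwen-32B \
  --port 8000 &

# Run the harness against it via the local-completions frontend
lm_eval --model local-completions \
  --model_args model=/model/DeepSeek-R1-Distill-Qwen-32B,base_url=http://localhost:8000/v1/completions,num_concurrent=1 \
  --tasks winogrande \
  --batch_size 1
```

Swapping `--tasks winogrande` for `--tasks mmlu` is what triggers the crash described above; mmlu fans out into many subtasks, so its memory footprint on the server side is larger.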
This looks like an OOM error. Can you provide the entire set of steps for how you run vLLM and how to reproduce this issue?
Synced offline. Please check the memory usage of each card to confirm whether the issue is OOM, and paste your environment/steps as Jian suggested.