Inference code demo for WizardCoder
ganler opened this issue · comments
Hi, thanks for the amazing work. I am interested in evaluating WizardCoder-Python-34B-V1.0
on HumanEval+. Just curious if there is a minimal Python/HF code snippet demo for me to reference? Thanks!
Thanks for your great eval-plus project. We ran an extra evaluation on HumanEval+ (HE+): the pass@1 is 64.6 (greedy decoding), higher than ChatGPT's 63.4. You can use humaneval_gen_vllm.py to generate the code completions.
pip install vllm # This can accelerate the inference process a lot.
pip install transformers==4.31.0
model="/path/to/your/model"
temp=0.2 # set to 0.0 for greedy decoding
max_len=2048
pred_num=200 # set to 1 for greedy decoding
num_seqs_per_iter=1
output_path=preds/T${temp}_N${pred_num}
mkdir -p ${output_path}
echo 'Output path: '$output_path
echo 'Model to eval: '$model
CUDA_VISIBLE_DEVICES=0,1,2,3 python humaneval_gen_vllm.py --model ${model} \
--start_index 0 --end_index 164 --temperature ${temp} \
--num_seqs_per_iter ${num_seqs_per_iter} --N ${pred_num} --max_len ${max_len} --output_path ${output_path} --num_gpus 4
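For anyone who wants a minimal standalone snippet rather than the full script, a sketch along these lines should work with vLLM's offline API. The prompt template below is an assumption (the Alpaca-style instruction format commonly used for WizardCoder); check humaneval_gen_vllm.py for the exact one used in the evaluation.

```python
def build_prompt(problem: str) -> str:
    # Alpaca-style instruction template; assumed here -- see
    # humaneval_gen_vllm.py in the WizardLM repo for the exact format.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\nCreate a Python script for this problem:\n"
        f"{problem}\n\n### Response:"
    )

if __name__ == "__main__":
    # vLLM import kept here so build_prompt stays importable without a GPU.
    from vllm import LLM, SamplingParams

    llm = LLM(model="WizardLM/WizardCoder-Python-34B-V1.0",
              tensor_parallel_size=4)
    # temperature=0.0 gives greedy decoding, matching the pass@1 setting above.
    params = SamplingParams(temperature=0.0, max_tokens=2048)
    outputs = llm.generate([build_prompt("def add(a, b):\n    ...")], params)
    print(outputs[0].outputs[0].text)
```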
Great! I was able to obtain the raw output (in a dialog fashion). Could you point me to the post-processing script that turns it into actual code? (I guess it is simply s.split("```python")[-1].split("```")[0]?)
Yes, we use a similar method: https://github.com/nlpxucan/WizardLM/blob/main/WizardCoder/src/process_humaneval.py
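For reference, a minimal sketch of that fence-stripping heuristic (the actual logic lives in process_humaneval.py linked above; this is just the idea from the thread):

```python
def extract_code(completion: str) -> str:
    # Take the text after the last "```python" fence, then cut at the
    # next closing fence. If no fence is present, return the completion as-is.
    if "```python" in completion:
        completion = completion.split("```python")[-1]
    return completion.split("```")[0].strip()
```

A fenced completion like "Here is the solution:\n```python\ndef add(a, b):\n    return a + b\n```" reduces to just the function body, while plain code passes through unchanged.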
Perfect, we now have the results, which look strong; they have been updated at https://evalplus.github.io/leaderboard.html
Thanks for the great work!