nlpxucan / WizardLM

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

WizardLM/WizardLM-13B-V1.1 performance

vincentyao2016 opened this issue · comments

Hi there,

Thank you for sharing the WizardLM model. I am running WizardLM/WizardLM-13B-V1.1 on an AutoDL machine, following the example provided in the repository. However, each call to model.generate takes approximately 3-4 minutes to complete. During generation, CPU usage reaches 100%, GPU utilization stays around 24%, and GPU memory consumption is roughly 15 GB. Could you please confirm whether this is the expected performance? I would also appreciate any suggestions on how to improve the model's inference efficiency.

  • PyTorch: 2.0.0
  • Python: 3.8 (Ubuntu 20.04)
  • CUDA: 11.8
  • GPU: RTX A5000 (24 GB)
  • CPU: 15 vCPU AMD EPYC 7371 16-Core Processor
  • RAM: 28 GB
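
For reference, below is a minimal sketch of how the model could be loaded so that it stays entirely on the GPU, assuming the standard Hugging Face transformers API rather than the repo's exact demo script. The model ID and prompt format are illustrative. A 13B model in fp16 needs roughly 26 GB of weights, which does not fit on a 24 GB A5000, so part of it may end up offloaded to CPU; that would match the 100% CPU / ~24% GPU pattern above. Loading in 8-bit (via bitsandbytes) is one way to keep the whole model on the GPU.

```python
# Minimal sketch (assumptions: Hugging Face transformers + bitsandbytes installed;
# model ID and prompt template are illustrative, not taken from the repo script).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "WizardLM/WizardLM-13B-V1.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,   # quantize weights to 8-bit so a 13B model fits in 24 GB
    device_map="auto",   # place all layers on the GPU automatically
)

# Vicuna-style prompt commonly used with WizardLM-13B-V1.1 (assumed here).
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "USER: Tell me about large language models. ASSISTANT:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
    )

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

With the full model resident in GPU memory, generation should be bound by the GPU rather than the CPU; if CPU usage still pegs at 100% while the GPU stays mostly idle, that would suggest layers are still being offloaded.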