OpenBMB / CPM-Bee

A 10-billion-parameter Chinese-English bilingual base large language model

OOM during consecutive inference

YingLaiLin opened this issue

Following the inference tutorial, I ran inference on the PIQA dataset. After 44 consecutive calls, an OOM error occurred.
Code for reference:
for data in data_list:
    inference_results = beam_search.generate([data], max_length=100, repetition_penalty=1.1)
    for res in inference_results:
        print(res)
=============== Error message ===============
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 39.45 GiB total capacity; 37.68 GiB already allocated; 20.25 MiB free; 38.26 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
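
The issue does not identify the root cause, so the following is only a diagnostic sketch for narrowing it down, not a confirmed fix. It wraps each call in `torch.no_grad()`, drops the per-call output reference, runs the garbage collector, releases cached blocks with `torch.cuda.empty_cache()`, and logs allocated versus reserved memory so that steady growth across iterations becomes visible. The helper names `cuda_memory_mib` and `generate_all` and the `log_every` parameter are hypothetical, introduced here for illustration. The sketch also assumes that `generate_fn` returns plain Python objects rather than GPU tensors.

```python
import contextlib
import gc

try:
    import torch
except ImportError:  # let the sketch run even where PyTorch is not installed
    torch = None


def cuda_memory_mib():
    """Return (allocated, reserved) in MiB, or None when CUDA is unavailable."""
    if torch is None or not torch.cuda.is_available():
        return None
    return (
        torch.cuda.memory_allocated() / 2**20,
        torch.cuda.memory_reserved() / 2**20,
    )


def generate_all(data_list, generate_fn, log_every=10):
    """Call generate_fn on one item at a time and clean up after every call.

    generate_fn is any callable that takes a list with one item and returns
    a list of results (hypothetical wrapper around the model's generate call).
    """
    # Disable autograd bookkeeping during inference when PyTorch is present.
    no_grad = torch.no_grad if torch is not None else contextlib.nullcontext
    results = []
    for i, data in enumerate(data_list, start=1):
        with no_grad():
            out = generate_fn([data])
        results.extend(out)
        del out          # drop the per-call reference
        gc.collect()     # collect unreachable Python objects holding tensors
        if torch is not None and torch.cuda.is_available():
            torch.cuda.empty_cache()  # return cached blocks to the driver
        if i % log_every == 0:
            print(f"step {i}: (allocated, reserved) MiB = {cuda_memory_mib()}")
    return results
```

With the objects from the snippet above, this could be called as `generate_all(data_list, lambda batch: beam_search.generate(batch, max_length=100, repetition_penalty=1.1))`. If the allocated figure keeps climbing even with this cleanup, something is probably still holding references to tensors between calls. If only the reserved figure is large, fragmentation is the more likely cause, which is what the `max_split_size_mb` hint in the error message addresses.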