THUDM / GLM

GLM (General Language Model)


When calling the glm model, I hit a bug in modeling_glm.py: the attention_mask initialization omits the device setting

luo-li-ba-suo opened this issue · comments

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)

The cause is that in the `GLMModel` class,
`if attention_mask is None: attention_mask = torch.zeros(batch_size)`
does not move `attention_mask` to the correct device.
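A minimal sketch of the suggested fix: create the default mask directly on the model's device instead of on CPU. The helper name and arguments below are illustrative, not the actual modeling_glm.py signature:

```python
import torch

def init_attention_mask(attention_mask, batch_size, device):
    # Illustrative fix: build the default mask on the caller's device
    # so later index_select/embedding ops see matching devices.
    if attention_mask is None:
        attention_mask = torch.zeros(batch_size, device=device)
    return attention_mask

# Usage sketch (on a GPU machine, pass torch.device("cuda:0") instead):
mask = init_attention_mask(None, 4, torch.device("cpu"))
print(mask.device, tuple(mask.shape))
```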

Hmm, it seems my error was actually caused by something else.
Leaving this mask issue alone appears to be harmless.