THUDM / GLM

GLM (General Language Model)

glm-10b / tokenization_glm.py

chenhaoenen opened this issue

for choice_str in choices:
    choice = torch.tensor(self(choice_str, add_special_tokens=False, padding=False)['input_ids'],
                          dtype=torch.long)
    choice_ids.append(choice)
    choice_indices.append(torch.arange(len(token), len(token) + len(choice), dtype=torch.long))
    attention_mask.append(torch.tril(torch.ones((len(choice), len(choice)), dtype=torch.long)))

    token = torch.cat((token, torch.tensor([self.sop_token_id], dtype=torch.long), choice[:-1]))
    position_id = torch.cat((position_id, torch.tensor([mask_position] * len(choice), dtype=torch.long)))
    block_position_id = torch.cat((block_position_id, torch.arange(1, 1 + len(choice), dtype=torch.long)))

choice[:-1] is sliced incorrectly.
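One way to examine the slice is to replay the loop body on small made-up inputs and compare the lengths of the three sequences. The sketch below is only an illustration: it uses plain Python lists in place of tensors, and `SOP_TOKEN_ID`, `MASK_POSITION`, `append_choice`, and all token ids are hypothetical values, not taken from the GLM vocabulary or code base.

```python
# Hypothetical ids, for illustration only (not real GLM vocabulary ids).
SOP_TOKEN_ID = 100
MASK_POSITION = 2


def append_choice(token, position_id, block_position_id, choice):
    """Replay the loop body from the issue with plain lists instead of tensors."""
    # Indices in the extended sequence that are paired with the choice tokens.
    choice_indices = list(range(len(token), len(token) + len(choice)))
    # Appended input: start-of-piece id, then all but the last choice token.
    token = token + [SOP_TOKEN_ID] + choice[:-1]
    position_id = position_id + [MASK_POSITION] * len(choice)
    block_position_id = block_position_id + list(range(1, 1 + len(choice)))
    return token, position_id, block_position_id, choice_indices


token = [11, 12, 13, 14]            # context tokens (made up)
position_id = [0, 1, 2, 3]
block_position_id = [0, 0, 0, 0]
choice = [21, 22, 23]               # one tokenized choice (made up)

token, position_id, block_position_id, choice_indices = append_choice(
    token, position_id, block_position_id, choice
)

# Each sequence grows by len(choice), so the three stay the same length.
assert len(token) == len(position_id) == len(block_position_id) == 7

# Show which input token sits at each choice index and which choice token
# that index is paired with.
for idx, target in zip(choice_indices, choice):
    print(f"index {idx}: input {token[idx]} -> paired choice token {target}")
```

In this replay, `[SOP_TOKEN_ID] + choice[:-1]` adds exactly `len(choice)` elements, the same number added to `position_id` and `block_position_id`. Passing the full `choice` instead would make `token` one element longer than the other two for every choice processed.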