THUDM / GLM-130B

GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)

Training objective

shuangshuangguo opened this issue

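I have a question about the model's training objective. According to Figure 2 in the paper, is the decoder output in shuffled order, [x5][x6][e][x3]? I think it should be in the original order, [x3][e][x5][x6]?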
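To make the two orderings in the question concrete, here is a minimal sketch. It assumes the autoregressive blank-infilling setup as I understand it from the GLM paper: each masked span is replaced by a single mask token in the corrupted input, and the spans are then generated one after another in some order. The helper `build_blank_infilling_example` and the token strings `[M]`, `[S]`, and `[E]` are illustrative names, not taken from the GLM-130B code base.

```python
from typing import List, Sequence, Tuple


def build_blank_infilling_example(
    tokens: Sequence[str],
    spans: Sequence[Tuple[int, int]],
    order: Sequence[int],
) -> Tuple[List[str], List[str], List[str]]:
    """Build (corrupted_input, decoder_input, decoder_target).

    `spans` are non-overlapping half-open (start, end) index pairs.
    `order` lists span indices in the order they are generated.
    Hypothetical helper for illustration only.
    """
    # Corrupted input: every masked span collapses to a single [M] token.
    corrupted: List[str] = []
    pos = 0
    for start, end in sorted(spans):
        corrupted.extend(tokens[pos:start])
        corrupted.append("[M]")
        pos = end
    corrupted.extend(tokens[pos:])

    # Generated part: each span is fed with a leading [S]
    # and predicted with a trailing [E].
    decoder_input: List[str] = []
    decoder_target: List[str] = []
    for i in order:
        start, end = spans[i]
        span = list(tokens[start:end])
        decoder_input += ["[S]"] + span
        decoder_target += span + ["[E]"]
    return corrupted, decoder_input, decoder_target


tokens = ["x1", "x2", "x3", "x4", "x5", "x6"]
spans = [(2, 3), (4, 6)]  # masks [x3] and [x5, x6]

# Shuffled span order, as read from the figure in the question.
shuffled = build_blank_infilling_example(tokens, spans, order=[1, 0])
# Original left-to-right span order, as proposed in the question.
in_order = build_blank_infilling_example(tokens, spans, order=[0, 1])

print(shuffled[2])
print(in_order[2])
```

With `order=[1, 0]` the target sequence starts with the tokens of the second span, which matches the shuffled reading; with `order=[0, 1]` it follows the original text order. The GLM paper describes randomly permuting the span order, so the shuffled target would be the expected one under that assumption, but the sketch does not confirm how GLM-130B itself implements this.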