GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
shuangshuangguo opened this issue a year ago
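I have a question about the model's training objective. According to Figure 2 in the paper, is the decoder output in shuffled order, i.e. [x5][x6][e][x3]? I would have expected it to be in the original order, [x3][e][x5][x6].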
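To make the two orderings concrete, here is a minimal sketch of how I understand the blank-infilling setup: the masked spans are replaced by a mask token in the corrupted text, and the decoder then generates the spans one after another, each closed by an end token. The function name `build_blank_infilling`, the token strings `[M]`, `[S]`, `[E]`, and the explicit `order` argument are my own assumptions for illustration and are not taken from the GLM-130B code. The sketch only shows what each ordering would look like; it does not settle which one the actual training code uses.

```python
def build_blank_infilling(tokens, spans, order):
    """Build a toy blank-infilling example (illustrative only).

    tokens: list of token strings
    spans:  sorted, non-overlapping (start, end) half-open index pairs to mask
    order:  permutation of span indices giving the order in which the
            decoder is asked to generate the spans
    """
    # Corrupted text: each masked span is replaced by a single [M] token.
    corrupted = []
    cursor = 0
    for start, end in spans:
        corrupted.extend(tokens[cursor:start])
        corrupted.append("[M]")
        cursor = end
    corrupted.extend(tokens[cursor:])

    # Decoder side: each span is fed with a leading [S] and its target
    # is the span itself followed by [E].
    decoder_input, targets = [], []
    for i in order:
        start, end = spans[i]
        span = tokens[start:end]
        decoder_input += ["[S]"] + span
        targets += span + ["[E]"]
    return corrupted, decoder_input, targets


tokens = ["x1", "x2", "x3", "x4", "x5", "x6"]
spans = [(2, 3), (4, 6)]  # masks [x3] and [x5, x6]

# Shuffled span order: second span first, as I read Figure 2.
shuffled = build_blank_infilling(tokens, spans, order=[1, 0])
# Original span order: what I expected instead.
in_order = build_blank_infilling(tokens, spans, order=[0, 1])

print("corrupted text :", shuffled[0])
print("shuffled target:", shuffled[2])
print("in-order target:", in_order[2])
```

In this sketch the corrupted text is the same in both cases; only the order of the spans on the decoder side changes, which is exactly the difference I am asking about.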