liucongg / ChatGLM-Finetuning

Fine-tuning ChatGLM-6B, ChatGLM2-6B, and ChatGLM3-6B on downstream tasks, covering Freeze, LoRA, P-tuning, full-parameter fine-tuning, and more.

Error building extension 'fused_adam' when running the trainer

J-G-Y opened this issue · comments

As described in the title.

It is preceded by the error: Unsupported gpu architecture 'compute_80'

with torch.cuda.amp.autocast(enabled=True, dtype=torch.bfloat16) as autocast, \
        torch.backends.cuda.sdp_kernel(enable_flash=False) as disable:
    # forward pass under bf16 autocast, with the flash-attention SDP kernel disabled
    outputs = model(**batch, use_cache=False)
    loss = outputs.loss
    tr_loss += loss.item()
    # the DeepSpeed engine handles backward and the optimizer step
    model.backward(loss)
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    model.step()

Where exactly is this code supposed to go? I'm hitting the same error as the OP, at:

model, optimizer, _, lr_scheduler = deepspeed.initialize(model=model, args=args, config=ds_config,
                                                          dist_init_required=True)

This is the line that raises the error.
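
For reference, a minimal sketch of where that snippet typically sits in a DeepSpeed training script; names like train_dataloader, args, and ds_config here are placeholders, not necessarily the repo's exact trainer code. When the config selects the fused Adam optimizer, DeepSpeed JIT-compiles the fused_adam CUDA extension inside deepspeed.initialize, so the build error surfaces there, before the training loop ever runs:

import torch
import deepspeed

# wrapping the model builds the optimizer; with a fused Adam config the
# fused_adam extension is compiled here, which is where the
# "Error building extension 'fused_adam'" appears
model, optimizer, _, lr_scheduler = deepspeed.initialize(
    model=model, args=args, config=ds_config, dist_init_required=True)

tr_loss = 0.0
for step, batch in enumerate(train_dataloader):
    batch = {k: v.to(model.device) for k, v in batch.items()}
    # the snippet from the earlier comment belongs inside this loop,
    # after deepspeed.initialize has returned the engine
    with torch.cuda.amp.autocast(enabled=True, dtype=torch.bfloat16), \
            torch.backends.cuda.sdp_kernel(enable_flash=False):
        outputs = model(**batch, use_cache=False)
        loss = outputs.loss
        tr_loss += loss.item()
        model.backward(loss)
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        model.step()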

This is most likely a CUDA version problem: the CUDA toolkit used to build the extension has to match the CUDA version your installed PyTorch was built against.
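
A quick way to confirm the mismatch (standard PyTorch calls; running nvcc assumes the CUDA toolkit is on PATH):

import subprocess
import torch

# CUDA version PyTorch was compiled against, e.g. '11.7'
print("torch CUDA:", torch.version.cuda)
# compute capability of the GPU, e.g. (8, 0) for A100/Ampere -> 'compute_80'
print("GPU capability:", torch.cuda.get_device_capability(0))
# CUDA toolkit that DeepSpeed uses to JIT-compile fused_adam;
# nvcc only understands 'compute_80' from CUDA 11.0 onwards
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)

If nvcc reports a release older than 11.0 while the GPU capability is (8, 0), upgrading the toolkit (or pointing CUDA_HOME at a CUDA installation that matches the PyTorch build) should make the 'Unsupported gpu architecture compute_80' error go away.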