Train LLaMA on a single A100 80GB node using 🤗 Transformers and 🚀 DeepSpeed pipeline parallelism
sysuprophet opened this issue 10 months ago · comments
Hello, after converting the model with covert2ckpt.py, the embedding size is increased by 1, right? Does vocab_size in the corresponding config.json need to be updated to match?
@HuangLK
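If the conversion does grow the embedding by one row (for example because a pad token was added), then config.json's vocab_size must match the checkpoint's embedding row count, or loading will fail with a shape mismatch. Below is a minimal sketch of that idea; the helper name `sync_vocab_size` is hypothetical (not part of this repo), and the assumption that the extra row comes from a single added pad token is inferred from the question, not confirmed by the source.

```python
import json

def sync_vocab_size(config: dict, embedding_rows: int) -> dict:
    """Return a copy of config with vocab_size matched to the
    checkpoint's embedding row count.

    Hypothetical helper: after conversion adds a row (e.g. a pad
    token), config.json must agree with the checkpoint shape.
    """
    new_config = dict(config)
    if new_config.get("vocab_size") != embedding_rows:
        new_config["vocab_size"] = embedding_rows
    return new_config

# Example: original LLaMA vocab is 32000; if the converted
# checkpoint's embedding has 32001 rows, patch the config to match.
config = {"vocab_size": 32000}
patched = sync_vocab_size(config, 32001)
print(json.dumps(patched))  # → {"vocab_size": 32001}
```

In practice you would read the embedding row count directly from the converted checkpoint's weight tensor and write the patched dict back to config.json.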