feizc / MLE-LLaMA

Multi-language Enhanced LLaMA

How to train the 30B or 65B model on multiple GPUs (4x80G)?

lishangjin opened this issue

https://zhuanlan.zhihu.com/p/616853024 mentions that the transformers version pulled in by the lora dependency has problems with multi-GPU support. You can replace it with the llama-parallelism branch from https://github.com/kooshi/alpaca-lora/tree/llama-parallelism.
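
Once a multi-GPU-capable stack is installed, the checkpoint still has to be sharded across the cards before LoRA training. The sketch below is not from this repo or the linked branch; it shows one common approach using Hugging Face `device_map="auto"` naive model parallelism plus `peft` LoRA adapters. The checkpoint path and LoRA hyperparameters are illustrative assumptions.

```python
# Minimal sketch: shard a 30B/65B LLaMA checkpoint across several GPUs
# (e.g. 4 x 80GB) with device_map="auto", then attach LoRA adapters.
# Requires transformers, accelerate, and peft to be installed.

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import LoraConfig, get_peft_model

MODEL_PATH = "path/to/llama-30b-hf"  # assumption: HF-format converted checkpoint

tokenizer = LlamaTokenizer.from_pretrained(MODEL_PATH)

# device_map="auto" lets accelerate split the layers across all visible GPUs,
# so a model too large for one 80GB card can still be loaded for training.
model = LlamaForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.float16,
    device_map="auto",
)

# LoRA config (illustrative values): only the adapter weights are trained.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

With the layers spread across the GPUs this way, a standard `transformers` Trainer or a custom training loop can then fine-tune the LoRA weights; the exact launch command depends on which alpaca-lora branch and accelerate configuration you use.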