About flash_attn
GXKIM opened this issue · comments
GXKIM commented
wangding zeng commented
This is a problem with the detection mechanism in transformers. You can try running "pip install ." from the source directory.
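To see what a package-detection check might observe, here is a minimal sketch. It assumes the check relies on two things: whether the module can be imported and whether installed distribution metadata exists for it. This is not the actual transformers code, and inspect_package is a hypothetical helper name used only for illustration.

```python
import importlib.metadata
import importlib.util


def inspect_package(name: str) -> dict:
    """Report whether `name` is importable and whether install metadata exists.

    Hypothetical helper: it only approximates the kind of check a library
    might perform and is not taken from transformers.
    """
    # True if Python can locate a top-level module with this name.
    importable = importlib.util.find_spec(name) is not None
    try:
        # Succeeds only if the package was installed with distribution metadata.
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    return {"importable": importable, "version": version}


if __name__ == "__main__":
    print(inspect_package("flash_attn"))
```

If the result shows the module as importable but with no version, the package is visible on the path without being properly installed, which is the kind of situation that installing with pip from the source directory is meant to fix.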
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models