InternLM / InternLM

Official release of InternLM2 7B and 20B base and chat models, with 200K context support.

Home Page: https://internlm.intern-ai.org.cn/


Is deployment supported on MindSpore with Ascend 910A or 910B?

ChingKwanCheung opened this issue

Describe the issue

The MindFormers documentation states that internlm_7b, internlm_20b, and internlm_7b_lora are currently supported for deployment on MindSpore. Can internlm2 also be deployed on MindSpore?

We have not fully verified or tried internlm2 on MindSpore yet; some adaptation work may be required. We will look into it.

A question: are the attention q, k, v weights of the internlm2 model merged into a single tensor, similar to the Baichuan model's `model.layers.0.self_attn.W_pack.weight`?

Yes, the wqkv weights are merged into a single matrix; see the explanation here:

InternLM/tools/README.md, lines 13 to 15 at commit 67c5e9d:

### Note
While the `convert2llama.py` tool is available, we still advise opting for InternLM2 when practical, chiefly due to its superior efficiency. InternLM2, which is adapted from LLaMA, streamlines the process by integrating the `Wq`, `Wk`, `Wv` weight matrices into a single matrix `Wqkv`. This integration leads to approximately a **5%** speed increase during training. Given the substantial costs associated with pre-training, this efficiency boost can result in significant savings.
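For intuition, here is a minimal PyTorch sketch of a fused QKV projection. It assumes a plain `[Wq; Wk; Wv]` concatenation along the output dimension; the actual InternLM2 checkpoint layout (which also groups heads for grouped-query attention) is what `convert2llama.py` unpacks, so treat this only as an illustration of why one matmul replaces three, not as the repository's exact conversion logic.

```python
# Sketch of a fused QKV projection, assuming a simple [Wq; Wk; Wv]
# concatenation along the output dimension (hypothetical layout for
# illustration; the real InternLM2 layout is handled by convert2llama.py).
import torch
import torch.nn as nn

hidden_size = 4096

# One fused projection instead of separate q_proj / k_proj / v_proj:
# a single matmul produces q, k, and v together.
wqkv = nn.Linear(hidden_size, 3 * hidden_size, bias=False)

x = torch.randn(1, 16, hidden_size)         # (batch, seq_len, hidden)
qkv = wqkv(x)                                # single matmul, (1, 16, 3 * hidden)
q, k, v = qkv.split(hidden_size, dim=-1)     # split back into q, k, v

print(q.shape, k.shape, v.shape)             # each (1, 16, 4096)
```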

This issue is marked as stale because it has been labeled invalid or awaiting response for 7 days without further activity. It will be closed in 7 days if the stale label is not removed or there is no further response.

This issue is closed because it has been stale for 7 days. Please open a new issue if you have a similar problem or any new updates.