InternLM / InternLM

Official release of InternLM2 7B and 20B base and chat models, with 200K context support.

Home Page: https://internlm.intern-ai.org.cn/


Is deployment supported on MindSpore with Ascend 910A or 910B?

ChingKwanCheung opened this issue

Describe the issue

The MindFormers documentation states that internlm_7b, internlm_20b, and internlm_7b_lora are currently supported for deployment on MindSpore. Can internlm2 also be deployed on MindSpore?

We have not fully verified or tried internlm2 on MindSpore yet; some adaptation work may be required. We will look into it.

A question: are the attention q, k, v weights of the internlm2 model merged into a single tensor, similar to the Baichuan model's `model.layers.0.self_attn.W_pack.weight`?

Yes, the wqkv weights are merged into a single matrix; see the explanation here:

InternLM/tools/README.md, lines 13 to 15 at commit 67c5e9d:

### Note
While the `convert2llama.py` tool is available, we still advise opting for InternLM2 when practical, chiefly due to its superior efficiency. InternLM2, which is adapted from LLaMA, streamlines the process by integrating the `Wq`, `Wk`, `Wv` weight matrices into a single matrix `Wqkv`. This integration leads to approximately a **5%** speed increase during training. Given the substantial costs associated with pre-training, this efficiency boost can result in significant savings.
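For intuition, here is a minimal PyTorch sketch of a fused QKV projection. It assumes a plain `[Wq; Wk; Wv]` concatenation along the output dimension; the actual InternLM2 checkpoint layout (which also groups heads for grouped-query attention) is what `convert2llama.py` unpacks, so treat this only as an illustration of why one matmul replaces three, not as the repository's exact conversion logic.

```python
# Sketch of a fused QKV projection, assuming a simple [Wq; Wk; Wv]
# concatenation along the output dimension (hypothetical layout for
# illustration; the real InternLM2 layout is handled by convert2llama.py).
import torch
import torch.nn as nn

hidden_size = 4096

# One fused projection instead of separate q_proj / k_proj / v_proj:
# a single matmul produces q, k, and v together.
wqkv = nn.Linear(hidden_size, 3 * hidden_size, bias=False)

x = torch.randn(1, 16, hidden_size)         # (batch, seq_len, hidden)
qkv = wqkv(x)                                # single matmul, (1, 16, 3 * hidden)
q, k, v = qkv.split(hidden_size, dim=-1)     # split back into q, k, v

print(q.shape, k.shape, v.shape)             # each (1, 16, 4096)
```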

This issue is marked as stale because it has been labeled invalid or awaiting response for 7 days without further activity. It will be closed in 7 days if the stale label is not removed or there is no further response.

This issue is closed because it has been stale for 7 days. Please open a new issue if you have a similar problem or any new updates.