#feature request# rope_scaling support
Xingxiangrui opened this issue · comments
XXR commented
As we all know, Mixtral already supports rope_theta: https://arxiv.org/abs/2310.05209
However, it does not support the rope_scaling parameter.
Will Mixtral support a rope_scaling param like LLaMA does?
"rope_scaling": {
  "factor": 4.0,
  "type": "linear"
},
Or it could simply be set to null for the current model:
"rope_scaling": null
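
For context, here is a minimal sketch of what `"type": "linear"` is generally understood to do: the position index is divided by `factor` before the rotary angles are computed, and `null` leaves positions unchanged. The function name `rope_angles` and its signature are hypothetical and for illustration only; this is not Mixtral's or LLaMA's actual implementation.

```python
def rope_angles(position, head_dim, rope_theta=10000.0, rope_scaling=None):
    """Rotary angles for a single position (illustrative helper, not library code)."""
    if rope_scaling is not None:
        if rope_scaling["type"] != "linear":
            raise ValueError("this sketch only handles linear scaling")
        # Linear scaling: compress positions by the configured factor.
        position = position / rope_scaling["factor"]
    # One inverse frequency per pair of dimensions in the head.
    inv_freq = [rope_theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]
    return [position * f for f in inv_freq]


scaled = rope_angles(8, 4, rope_scaling={"factor": 4.0, "type": "linear"})
unscaled = rope_angles(2, 4, rope_scaling=None)
```

With `factor` 4.0, position 8 gets the same angles as position 2 without scaling, which is why the config above would stretch the usable context by roughly that factor.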
If it supports the rope_scaling param, we can merge LLaMA and Mistral models into a Mixtral-MoE without modifying the Mixtral source code: