#feature request# rope_scaling support

Xingxiangrui opened this issue 6 months ago

As we all know, Mixtral already supports rope_theta: https://arxiv.org/abs/2310.05209
However, it does not support the rope_scaling parameter.
Will Mixtral support the rope_scaling parameter the way LLaMA does?

"rope_scaling": {
    "factor": 4.0,
    "type": "linear"
},

Or it could simply be set to null for the current model:

"rope_scaling": null
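
For context, here is a minimal sketch of what I understand the "linear" setting to mean: position indices are divided by the factor before the rotary angles are computed, and null leaves them untouched. The function name `rope_angles` and its arguments are my own illustration, not code from Mixtral or LLaMA.

```python
def rope_angles(position, head_dim, rope_theta=10000.0, rope_scaling=None):
    """Rotary angles for one position (hypothetical helper, for illustration only)."""
    if rope_scaling is not None:
        if rope_scaling["type"] != "linear":
            raise ValueError("this sketch only covers the 'linear' type")
        # Linear scaling: compress the position index by the factor.
        position = position / rope_scaling["factor"]
    # One angle per pair of dimensions: position / theta^(2i / head_dim).
    return [position / (rope_theta ** (2 * i / head_dim))
            for i in range(head_dim // 2)]


# With factor 4.0, position 8 gets the same angles as unscaled position 2.
scaled = rope_angles(8, 4, rope_scaling={"factor": 4.0, "type": "linear"})
plain = rope_angles(2, 4, rope_scaling=None)
print(scaled == plain)
```

So a null value would keep the current behavior, and a factor of 4.0 would stretch the usable position range by roughly 4x under this assumption.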

If it supported the rope_scaling parameter, we could merge LLaMA and Mistral models into Mixtral-MoE without modifying the Mixtral source code:

arcee-ai/mergekit#88