Why is DeepSeek-Coder-v2 236B not trained with the FIM objective?
wasiahmad opened this issue
Wasi Ahmad commented
The paper mentions that DeepSeek-Coder-v2 236B was trained using only the Next-Token-Prediction objective, with no FIM objective. Is there a reason FIM was not used?
Daya Guo commented
The deepseek-coder-v2 236B model was not intended for code completion, so FIM (Fill-in-the-Middle) was not used.
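For readers unfamiliar with the distinction: next-token prediction trains on a document in its original left-to-right order, while FIM rearranges the document so the model learns to generate a missing middle span given the text on both sides. Below is a minimal sketch of one common arrangement (prefix, suffix, middle). The sentinel strings `<PRE>`, `<SUF>`, `<MID>` and the function names are hypothetical placeholders for illustration, not the special tokens or data pipeline used by any DeepSeek model.

```python
import random

# Hypothetical sentinel strings; real models define their own special tokens.
PRE, SUF, MID = "<PRE>", "<SUF>", "<MID>"


def to_fim_example(doc: str, rng: random.Random) -> str:
    """Rearrange a document into prefix-suffix-middle order for FIM training."""
    # Pick two cut points that split the document into prefix / middle / suffix.
    i, j = sorted(rng.randrange(len(doc) + 1) for _ in range(2))
    prefix, middle, suffix = doc[:i], doc[i:j], doc[j:]
    # The model sees the prefix and suffix first, then learns to emit the middle.
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}"


def recover_document(example: str) -> str:
    """Undo the rearrangement (assumes the document contains no sentinel strings)."""
    rest = example[len(PRE):]
    prefix, rest = rest.split(SUF, 1)
    suffix, middle = rest.split(MID, 1)
    return prefix + middle + suffix


if __name__ == "__main__":
    rng = random.Random(0)
    doc = "def add(a, b):\n    return a + b\n"
    example = to_fim_example(doc, rng)
    print(repr(example))
    print(recover_document(example) == doc)
```

With plain next-token prediction the training string would simply be `doc` itself. The rearranged form matters mainly for editor-style completion, where code exists both before and after the cursor.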
Wasi Ahmad commented
Is it intended to be used as an instruction-following LLM?
Daya Guo commented
Yes.