Question: DeepSeek V3.1 seems to be faster than DeepSeek V3 (0324 version) when decoding with SGLang on H800.

Huixxi opened this issue 18 days ago

I don't know whether V3.1 is really faster than V3, but if it is, does that have something to do with its new UE8M0 format, and does DeepGEMM support this type of FP8 natively? Thanks.

Chenggang Zhao · Answer 1 · Thu Aug 28 2025 16:57:44 GMT+0800 (China Standard Time)

On the same device, UE8M0 and FP32 scaling factors (SF) offer the same level of performance. My guess is that V3.1's outputs are shorter than V3's on average, so each request finishes decoding sooner even though the per-token speed is the same.
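
One way to check this guess is to normalize decode time by output length instead of comparing end-to-end request time. The sketch below is only an illustration: `DecodeRun`, `time_per_output_token`, and `mean_request_seconds` are hypothetical helpers, not part of SGLang or DeepGEMM, and the numbers are made up to show the effect, not measured on any hardware.

```python
from dataclasses import dataclass


@dataclass
class DecodeRun:
    """One generation request (hypothetical record, not an SGLang type)."""
    output_tokens: int     # tokens produced during the decode phase
    decode_seconds: float  # wall-clock decode time, prefill excluded


def time_per_output_token(runs: list[DecodeRun]) -> float:
    """Average seconds per generated token over all runs."""
    total_tokens = sum(r.output_tokens for r in runs)
    if total_tokens == 0:
        raise ValueError("no output tokens recorded")
    return sum(r.decode_seconds for r in runs) / total_tokens


def mean_request_seconds(runs: list[DecodeRun]) -> float:
    """Average decode time per request, which depends on output length."""
    if not runs:
        raise ValueError("no runs recorded")
    return sum(r.decode_seconds for r in runs) / len(runs)


# Made-up numbers for illustration only: the second model writes shorter
# answers at the same per-token speed.
v3_runs = [DecodeRun(1000, 20.0), DecodeRun(800, 16.0)]
v31_runs = [DecodeRun(600, 12.0), DecodeRun(400, 8.0)]

for name, runs in (("V3", v3_runs), ("V3.1", v31_runs)):
    print(
        f"{name}: {time_per_output_token(runs) * 1000:.1f} ms/token, "
        f"{mean_request_seconds(runs):.1f} s/request"
    )
```

If the per-token figure matches between the two models while the per-request figure differs, the apparent speed-up comes from output length rather than from the kernels.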

HU · Answer 2 · Thu Aug 28 2025 17:11:41 GMT+0800 (China Standard Time)

Ok, thanks a lot.