Question: DeepSeek V3.1 seems to decode faster than DeepSeek V3 (0324 version) under SGLang on H800.
Huixxi opened this issue · comments
I don't know whether V3.1 is really faster than V3, but if it is, does that have something to do with its new UE8M0 format, and does DeepGEMM support this type of FP8 natively? Thanks.
On the same device, UE8M0 and FP32 scale factors (SF) offer the same level of performance. My guess is that V3.1's outputs are shorter than V3's on average, so each request finishes sooner.
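One way to check the output-length explanation is to normalize decode time by the number of generated tokens instead of comparing total request time. Below is a minimal sketch. The helper `decode_stats` and all the numbers are hypothetical placeholders, not measurements from SGLang or H800; substitute the token counts and decode times from your own runs.

```python
def decode_stats(output_tokens: int, decode_seconds: float) -> dict:
    """Return per-token decode metrics for one request (hypothetical helper)."""
    if output_tokens <= 0 or decode_seconds <= 0:
        raise ValueError("output_tokens and decode_seconds must be positive")
    return {
        "tokens_per_second": output_tokens / decode_seconds,
        "ms_per_token": 1000 * decode_seconds / output_tokens,
    }


# Made-up example values: the second run finishes sooner only because
# it generates fewer tokens.
v3 = decode_stats(output_tokens=1000, decode_seconds=20.0)
v31 = decode_stats(output_tokens=600, decode_seconds=12.0)

print("V3  :", v3)
print("V3.1:", v31)
```

If the per-token numbers match while the total times differ, the difference comes from output length rather than from the kernels or the scale-factor format.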
Ok, thanks a lot.