Can I still use FP8 E5M2 KV Cache if my GPU capability is less than 8.9?
blacker521 opened this issue
Vectory commented
Your current environment
Can I still use the FP8 E5M2 KV cache if my GPU's compute capability is lower than 8.9? vLLM did not report any errors and showed that the FP8 KV cache was in use, even though my GPU's compute capability is below 8.9.
How would you like to use vllm
Use vLLM on an A10
Vectory commented
"the FP8 KV Cache just does compression/decompression and doesn't require any real hardware support to do so". This sentence dispelled my doubts.
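
For anyone else landing here, a rough illustration of why no FP8 hardware is needed. E5M2 has the same sign bit and 5 exponent bits as float16 but keeps only 2 mantissa bits, so an E5M2 code can be treated as the top byte of a float16 value. The NumPy sketch below is not vLLM's actual kernel code. The function names `compress_e5m2` and `decompress_e5m2` are made up for this example, and it truncates the low mantissa bits instead of rounding, which real conversion routines may handle differently.

```python
import numpy as np


def compress_e5m2(x: np.ndarray) -> np.ndarray:
    """Pack float16 values into 8-bit E5M2 codes (hypothetical helper).

    Keeps the sign bit, the 5 exponent bits, and the top 2 mantissa bits,
    which together form the high byte of the float16 bit pattern.
    Assumption: the low 8 mantissa bits are truncated, not rounded.
    """
    bits = np.ascontiguousarray(x, dtype=np.float16).view(np.uint16)
    return (bits >> 8).astype(np.uint8)


def decompress_e5m2(codes: np.ndarray) -> np.ndarray:
    """Expand 8-bit E5M2 codes back to float16 (hypothetical helper).

    Puts each code back into the high byte and zero-fills the low byte.
    """
    bits = np.left_shift(codes.astype(np.uint16), 8).astype(np.uint16)
    return bits.view(np.float16)


# Stand-in for a slice of cached keys/values.
kv = np.array([1.0, 0.5, -2.0, 1.75, 1.3], dtype=np.float16)

codes = compress_e5m2(kv)           # 1 byte per element instead of 2
restored = decompress_e5m2(codes)   # ordinary float16 again

print(codes.nbytes, kv.nbytes)
print(restored)
```

Values that fit in 2 mantissa bits (1.0, 0.5, -2.0, 1.75) survive the round trip exactly, while 1.3 comes back as 1.25. The cache is stored at half the size and expanded back to a regular 16-bit type before the attention math runs, which is plain integer and float16 work that an A10 (compute capability 8.6) can do.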