vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Home Page: https://docs.vllm.ai


Can I still use FP8 E5M2 KV Cache if my GPU capability is less than 8.9?

blacker521 opened this issue

Your current environment

Can I still use the FP8 E5M2 KV cache if my GPU's compute capability is less than 8.9? vLLM did not report any errors and showed the FP8 KV cache in use, even though my GPU's compute capability is below 8.9.

How would you like to use vllm

Use vLLM on an A10.

"the FP8 KV Cache just does compression/decompression and doesn't require any real hardware support to do so".This sentence dispelled my doubts