QwenLM / qwen.cpp

C++ implementation of Qwen-LM


Support for AMD's ROCm

riverzhou opened this issue

Official llama.cpp already supports ROCm; when will qwen.cpp support ROCm?

commented

https://github.com/YellowRoseCx/koboldcpp-rocm — this project can use hipBLAS on Windows for GGML and GGUF models.

Thanks!


I modified the ggml framework to support ROCm, and added ROCm support to qwen.cpp:

https://github.com/riverzhou/qwen.cpp

Tests pass on my 7800 XT, and speed is around 37 tokens/second with the 14B Q5_1 model.

I submitted a pull request to upstream ggml, and it was just merged.
For now, just add

if (GGML_HIPBLAS)
  # The HIP build reuses ggml's CUDA code path, so both macros are defined
  add_compile_definitions(GGML_USE_HIPBLAS GGML_USE_CUBLAS)
  # Forward the requested AMD GPU architectures (e.g. gfx1101) to the ggml target
  set_property(TARGET ggml PROPERTY AMDGPU_TARGETS ${AMDGPU_TARGETS})
endif()

to qwen.cpp's CMakeLists.txt and update the ggml submodule, and it will support AMD's ROCm.
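
For anyone who wants to try this, a minimal configure-and-build sketch might look like the following. The GGML_HIPBLAS option name comes from the CMake snippet above; the gfx1101 architecture (assumed for the RX 7800 XT mentioned earlier) and the /opt/rocm compiler paths are assumptions that depend on your GPU and ROCm installation:

# Sketch only: GGML_HIPBLAS matches the CMake option above; gfx1101
# assumes an RX 7800 XT; compiler paths assume a default ROCm install
# under /opt/rocm.
cmake -B build \
      -DGGML_HIPBLAS=ON \
      -DAMDGPU_TARGETS=gfx1101 \
      -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
      -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++
cmake --build build -j

Depending on the ROCm version, pointing CMake at hipcc instead of ROCm's clang may also work; the important pieces are enabling the hipBLAS option and matching AMDGPU_TARGETS to your GPU's architecture.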

Can it support AMD ROCm on Windows?