QwenLM / qwen.cpp

C++ implementation of Qwen-LM


Support for AMD's ROCm

riverzhou opened this issue

Official llama.cpp already supports ROCm; when will qwen.cpp support ROCm?

commented

https://github.com/YellowRoseCx/koboldcpp-rocm — this project can use hipBLAS on Windows for GGML and GGUF models.

Thanks!


I modified the ggml framework to support ROCm, and added ROCm support to qwen.cpp:

https://github.com/riverzhou/qwen.cpp

Tests pass on my 7800 XT, and speed is around 37 tokens/second with the 14B Q5_1 model.

I submitted a pull request to upstream ggml, and it was just merged.
For now, just add

if (GGML_HIPBLAS)
  # The HIP build reuses ggml's CUDA code path, so both macros are defined
  add_compile_definitions(GGML_USE_HIPBLAS GGML_USE_CUBLAS)
  # Forward the requested AMD GPU architectures (e.g. gfx1101) to the ggml target
  set_property(TARGET ggml PROPERTY AMDGPU_TARGETS ${AMDGPU_TARGETS})
endif()

to qwen.cpp's CMakeLists.txt and update the ggml submodule, and it will support AMD's ROCm.
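
For anyone who wants to try this, a minimal configure-and-build sketch might look like the following. The GGML_HIPBLAS option name comes from the CMake snippet above; the gfx1101 architecture (assumed for the RX 7800 XT mentioned earlier) and the /opt/rocm compiler paths are assumptions that depend on your GPU and ROCm installation:

# Sketch only: GGML_HIPBLAS matches the CMake option above; gfx1101
# assumes an RX 7800 XT; compiler paths assume a default ROCm install
# under /opt/rocm.
cmake -B build \
      -DGGML_HIPBLAS=ON \
      -DAMDGPU_TARGETS=gfx1101 \
      -DCMAKE_C_COMPILER=/opt/rocm/llvm/bin/clang \
      -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++
cmake --build build -j

Depending on the ROCm version, pointing CMake at hipcc instead of ROCm's clang may also work; the important pieces are enabling the hipBLAS option and matching AMDGPU_TARGETS to your GPU's architecture.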

Can it support AMD ROCm on Windows?