ggerganov / ggml

Tensor library for machine learning

CUDA implementation of ggml_clamp

jploski opened this issue

ggml_clamp currently lacks a CUDA implementation. It is rather trivial to add, but it is something I ran into while porting MPT to llama.cpp. I have a ready implementation and will create a PR for it, referencing this issue.

Hmm, I'm a bit confused, as the internals (e.g. of ggml_scale) on the ggml master branch look different from those on llama.cpp master. llama.cpp's implementation of ggml-cuda.cu seems more recent (and cleaner looking, too). Will ggml eventually catch up to llama.cpp? (In that case I would contribute the patch to llama.cpp instead.)

Most of the development of the CUDA backend happens in the llama.cpp repository, and the changes are merged here regularly. Opening the PR in the llama.cpp repository is fine; the changes will eventually end up here. I see that the ggml_clamp implementation is already in your MPT PR in llama.cpp, so there is no need to open another PR here.

Thanks for clarifying - I thought it worked exactly the other way around.