Question
fakerybakery opened this issue
Hi, thank you for releasing llamafile! The speedups are very impressive. Are there any plans to merge the improvements made here upstream into the llama.cpp repo?
Thanks!
This was asked on Twitter; Justine's answer was yes, it's in flight: https://twitter.com/JustineTunney/status/1783332508505194674
Thanks! (That was me on Twitter :) But llamafile is still much faster than llama.cpp. Does that mean not all of the improvements have been upstreamed?
Yes, we're happy to share our optimizations with llama.cpp. Here are two PRs I sent them, which got merged:
The reason llamafile continues to be faster than llama.cpp is that I've discovered even more performance opportunities in the last few weeks, since I published my blog post https://justine.lol/matmul/ My latest tricks will be upstreamed too; however, they're still awaiting approval.