Mozilla-Ocho / llamafile

Distribute and run LLMs with a single file.

Home Page: https://llamafile.ai

Question

fakerybakery opened this issue · comments

Hi, thank you for releasing llamafile! The speedups are very impressive. Are there any plans to merge the improvements made here upstream into the llama.cpp repo?
Thanks!

commented

This was asked on Twitter; Justine's answer was yes, it's in flight: https://twitter.com/JustineTunney/status/1783332508505194674

Thanks! (That was me on Twitter. :) But llamafile is still much faster than llama.cpp. Does that mean not all of the improvements have been upstreamed?

Yes, we're happy to share our optimizations with llama.cpp. Here are two PRs I sent them, which were merged:

The reason llamafile continues to be faster than llama.cpp is that I've discovered even more performance opportunities in the few weeks since I published my blog post: https://justine.lol/matmul/. My latest tricks will be upstreamed too; however, they're still awaiting approval.