Mozilla-Ocho / llamafile

Distribute and run LLMs with a single file.

Home Page: https://llamafile.ai

Question

fakerybakery opened this issue · comments

Hi, thank you for releasing llamafile! The speedups are very impressive. Are there any plans to merge the improvements made here upstream into the llama.cpp repo?
Thanks!

commented

This was asked on Twitter; Justine's answer was yes, it's in flight: https://twitter.com/JustineTunney/status/1783332508505194674

Thanks! (That was me on Twitter. :) But llamafile is still much faster than llama.cpp. Does that mean not all of the improvements have been upstreamed?

Yes, we're happy to share our optimizations with llama.cpp. Here are two PRs I sent them, which were merged:

The reason llamafile continues to be faster than llama.cpp is that I've discovered even more performance opportunities in the few weeks since I published my blog post: https://justine.lol/matmul/. My latest tricks will be upstreamed too; however, they're still awaiting approval.