Comparison with TVM
lucasjinreal opened this issue · comments
MagicSource commented
Does it run faster than PyTorch or TVM now? Or llama.cpp?
Joe Fioti commented
Currently it's faster than PyTorch for LLMs on Metal, and about 10-20% slower than llama.cpp; unsure about TVM. Proper benchmarks (#21) are a goal I want to get done soon.
MagicSource commented
That's pretty good. How about on CUDA?