Evaluation run for all "good open weight models" with all available quantizations and different GPUs
zimmski opened this issue · comments
Markus Zimmermann commented
Not sure yet how we should do this. CPU-only inference will break us here, and speed metrics are important as well.