transformerlab / transformerlab-app

Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.

Home Page: https://transformerlab.ai/


mlx-community/Mixtral-8x7B-Instruct-v0.1-hf-4bit-mlx does not generate tokens

daboe01 opened this issue · comments

Generation ends silently without any output.

mlx-community/Phi-3-medium-128k-instruct-4bit runs fine on the same installation.

I suspect this is a timeout that we just aren't handling well. The Mixtral model, even at 4-bit, is about 26 GB, vs. Phi-3 at 4-bit at about 8 GB.
On my machine (M3 Max, 36 GB) this model is very slow. It looks like it only gets a few words out before the timeout fires, at which point the app appears to clear out whatever progress had been returned.
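The suspected failure mode can be reproduced with a toy sketch (this is not Transformer Lab's actual code; `slow_generate`, `generate_blocking`, and `generate_streaming` are hypothetical names): waiting on the full completion with a hard timeout discards every token produced so far, whereas streaming tokens as they arrive preserves partial progress when the deadline hits.

```python
import concurrent.futures
import queue
import threading
import time


def slow_generate(n_tokens=10, delay=0.05):
    """Stand-in for a slow model: yields one token at a time."""
    for i in range(n_tokens):
        time.sleep(delay)
        yield f"tok{i}"


def generate_blocking(timeout):
    """Anti-pattern: wait for the complete result with a hard timeout.

    If the model is too slow, TimeoutError fires and everything
    generated so far is thrown away -- the 'silent, no output' symptom.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    fut = pool.submit(lambda: list(slow_generate()))
    try:
        return fut.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        return []  # partial progress is lost
    finally:
        pool.shutdown(wait=False)


def generate_streaming(timeout):
    """Collect tokens incrementally so a timeout keeps partial output."""
    q = queue.Queue()

    def worker():
        for tok in slow_generate():
            q.put(tok)
        q.put(None)  # sentinel: generation finished

    threading.Thread(target=worker, daemon=True).start()

    collected = []
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # deadline hit, but keep what we have
        try:
            tok = q.get(timeout=remaining)
        except queue.Empty:
            break
        if tok is None:
            break
        collected.append(tok)
    return collected
```

With a 0.2 s budget against a generator that needs ~0.5 s, `generate_blocking` returns nothing while `generate_streaming` returns the first few tokens, which is roughly the behavior a fix for this issue would want.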