mlx-community/Mixtral-8x7B-Instruct-v0.1-hf-4bit-mlx does not generate tokens
daboe01 opened this issue · comments
Generation ends silently without producing any output.
mlx-community/Phi-3-medium-128k-instruct-4bit runs fine on the same installation.
I suspect this is a timeout that we just aren't handling well. The Mixtral model, even at 4-bit, is about 26 GB, versus about 8 GB for Phi 3 at 4-bit.
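For reference, those sizes can be roughly reproduced with some back-of-the-envelope arithmetic. This is only a sketch: the parameter counts (about 46.7B total for Mixtral 8x7B, about 14B for Phi 3 medium) and the quantization overhead (group size 64 with a 16-bit scale and 16-bit bias per group, i.e. about 4.5 bits per weight) are my assumptions, not values read from these checkpoints, and `quantized_size_gb` is a made-up helper name.

```python
def quantized_size_gb(num_params: float,
                      bits: int = 4,
                      group_size: int = 64,
                      overhead_bits_per_group: int = 32) -> float:
    """Rough on-disk / in-memory size of group-quantized weights, in GB.

    Assumes every weight is quantized and each group of `group_size`
    weights carries `overhead_bits_per_group` extra bits (scale + bias).
    Real checkpoints keep some tensors unquantized, so this is approximate.
    """
    bits_per_weight = bits + overhead_bits_per_group / group_size
    return num_params * bits_per_weight / 8 / 1e9


# Assumed parameter counts, not taken from the model configs.
mixtral_gb = quantized_size_gb(46.7e9)
phi3_medium_gb = quantized_size_gb(14e9)

print(f"Mixtral 8x7B 4-bit:  ~{mixtral_gb:.1f} GB")
print(f"Phi 3 medium 4-bit:  ~{phi3_medium_gb:.1f} GB")
```

Under those assumptions the Mixtral weights alone take up most of a 36 GB machine, which would fit with it being far slower than Phi 3 on the same hardware.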
On my machine (M3 Max, 36 GB) this model is very slow. It looks like it only gets a few words out before the timeout hits, at which point the app appears to clear whatever output had been returned so far.
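If that is what's happening, one way to handle it would be to keep the partial text when the deadline passes instead of discarding it. Here is a minimal sketch of the idea, not the app's actual code: `collect_with_deadline` and `slow_tokens` are hypothetical names, and the token source is simulated with `time.sleep` rather than a real model.

```python
import time
from typing import Iterable, Iterator, List, Tuple


def collect_with_deadline(tokens: Iterable[str],
                          timeout_s: float) -> Tuple[str, bool]:
    """Consume a token stream until it ends or the deadline passes.

    Returns (text_so_far, timed_out). The partial text is kept on timeout.
    The deadline is only checked between tokens, so a single very slow
    token is not interrupted.
    """
    deadline = time.monotonic() + timeout_s
    parts: List[str] = []
    timed_out = False
    for tok in tokens:
        parts.append(tok)
        if time.monotonic() >= deadline:
            timed_out = True
            break
    return "".join(parts), timed_out


def slow_tokens(words: List[str], delay_s: float) -> Iterator[str]:
    """Stand-in for a slow model: yields one word every `delay_s` seconds."""
    for w in words:
        time.sleep(delay_s)
        yield w + " "


text, timed_out = collect_with_deadline(
    slow_tokens(["only", "a", "few", "words", "come", "out"], 0.05),
    timeout_s=0.12,
)
if timed_out:
    # Surface what we have and tell the user, rather than clearing it.
    print(f"(timed out) partial output: {text!r}")
else:
    print(text)
```

The caller can then show the partial output together with a visible "timed out" message, so the failure is no longer silent.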