mlx-community/Mixtral-8x7B-Instruct-v0.1-hf-4bit-mlx does not generate tokens

Question

mlx-community/Mixtral-8x7B-Instruct-v0.1-hf-4bit-mlx does not generate tokens

daboe01 opened this issue 2 months ago

daboe01 · Answer 1 · Thu May 30 2024 04:58:50 GMT+0800 (China Standard Time)

Generation ends silently without any output.

daboe01 · Answer 2 · Thu May 30 2024 04:59:40 GMT+0800 (China Standard Time)

mlx-community/Phi-3-medium-128k-instruct-4bit runs fine on the same installation.

Tony Salomone · Answer 3 · Fri May 31 2024 01:50:28 GMT+0800 (China Standard Time)

I suspect this is a timeout that we just aren't handling well. The Mixtral model, even at 4-bit, is about 26 GB, vs. about 8 GB for Phi-3 at 4-bit.
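
As a rough sanity check on those sizes, here is a back-of-the-envelope estimate. It is only a sketch under stated assumptions: about 46.7B total parameters for Mixtral 8x7B and about 14B for Phi-3 medium, every weight quantized to 4 bits in groups of 64, and one 16-bit scale plus one 16-bit bias stored per group. The function name is made up for illustration, and real checkpoints will differ somewhat because some layers are not quantized.

```python
def quantized_size_gb(n_params, bits=4, group_size=64, overhead_bits=32):
    """Approximate size of group-quantized weights in GB (1 GB = 1e9 bytes).

    overhead_bits is the per-group metadata (assumed here: a 16-bit scale
    plus a 16-bit bias), amortized over group_size weights.
    """
    bits_per_weight = bits + overhead_bits / group_size
    return n_params * bits_per_weight / 8 / 1e9


# Parameter counts are assumptions, not values read from the checkpoints.
mixtral = quantized_size_gb(46.7e9)
phi3_medium = quantized_size_gb(14e9)

print(f"Mixtral 8x7B, 4-bit: ~{mixtral:.1f} GB")
print(f"Phi-3 medium, 4-bit: ~{phi3_medium:.1f} GB")
```

Under these assumptions the estimates land close to the 26 GB and 8 GB figures quoted above, which leaves little headroom for the larger model on a 36 GB machine.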
On my machine (M3 Max, 36 GB), this model is very slow. It looks like it only gets a few words out before the timeout, at which point the app appears to clear whatever progress had been returned.
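
If the timeout theory is right, one way to avoid losing the partial output would be to return whatever has been generated so far along with a flag saying the deadline was hit. The sketch below is purely illustrative: `collect_with_deadline` and its parameters are hypothetical names, and it does not reflect how the app actually implements its timeout.

```python
import time


def collect_with_deadline(tokens, timeout_s, clock=time.monotonic):
    """Gather streamed tokens until the stream ends or the deadline passes.

    Returns (text_so_far, timed_out). The partial text is kept on timeout
    instead of being discarded.
    """
    deadline = clock() + timeout_s
    pieces = []
    for tok in tokens:
        pieces.append(tok)
        # The check runs between tokens, so it cannot interrupt a single
        # token that is itself slow to arrive.
        if clock() >= deadline:
            return "".join(pieces), True
    return "".join(pieces), False


# Usage with a stand-in token stream (a real one would come from the model).
text, timed_out = collect_with_deadline(iter(["Hello", ", ", "world"]), timeout_s=60)
print(text, "(timed out)" if timed_out else "(complete)")
```

With this shape, the caller can show the partial text with a "generation timed out" notice rather than ending silently.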