Add a --scan-models flag to mlx_lm.server to check downloaded models

Question

ivanfioravanti opened this issue 3 months ago

With the Hugging Face client, we can use the following to list the MLX models downloaded locally, along with some stats about them:
pip install 'huggingface_hub[cli]'
huggingface-cli scan-cache | grep -E '^REPO|mlx'

We could:

Include a Python-only way to list the models and some info about them, if needed
Leverage the huggingface_hub package to do that
Simply update the docs to mention how to use the HF CLI to get this info and clean up old models if needed (I think this is good enough)
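For reference, the first option (a Python-only listing) might look roughly like the sketch below. It assumes the usual hub cache layout, where each model repo lives in a folder named models--<org>--<name> with its files under blobs/. The helper names default_hub_cache and list_cached_models are hypothetical, and the cache path resolution is simplified (it only looks at HF_HUB_CACHE and HF_HOME).

```python
import os
from pathlib import Path


def default_hub_cache() -> Path:
    """Best-effort guess at the hub cache directory (simplified assumption)."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(hf_home) / "hub"


def list_cached_models(cache_dir: Path, pattern: str = "mlx") -> list:
    """Return (repo_id, size_in_bytes) for cached model repos matching `pattern`."""
    results = []
    if not cache_dir.is_dir():
        return results
    for repo_dir in sorted(cache_dir.iterdir()):
        # Model repos are stored as "models--<org>--<name>"; skip datasets, spaces, etc.
        if not repo_dir.is_dir() or not repo_dir.name.startswith("models--"):
            continue
        # Caveat: a repo name that itself contains "--" would be converted incorrectly.
        repo_id = repo_dir.name[len("models--"):].replace("--", "/")
        if pattern.lower() not in repo_id.lower():
            continue
        # Real file contents live under blobs/; snapshots/ normally holds links to them.
        blobs = repo_dir / "blobs"
        size = 0
        if blobs.is_dir():
            size = sum(f.stat().st_size for f in blobs.rglob("*") if f.is_file())
        results.append((repo_id, size))
    return results
```

Calling list_cached_models(default_hub_cache()) would then give roughly what the grep over scan-cache shows, minus the revision and last-access details that the Hugging Face tooling tracks.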

Awni Hannun · Answer 1 · Sun Apr 28 2024 21:45:15 GMT+0800 (China Standard Time)

I'm open to having this in MLX LM / MLX Server if it's useful for people. We can use the huggingface_hub Python API. I think the relevant one is this: https://huggingface.co/docs/huggingface_hub/v0.22.2/en/guides/manage-cache#scan-cache-from-python
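A minimal sketch of what the linked guide enables, assuming huggingface_hub is installed: scan_cache_dir() returns cache info whose repos expose repo_id, repo_type, size_on_disk, size_on_disk_str, and nb_files. The function names select_mlx_repos, scan_mlx_models, and format_report are hypothetical and are not what ended up in mlx_lm; the "mlx" substring match simply mirrors the grep above.

```python
from typing import Iterable, List


def select_mlx_repos(repos: Iterable, pattern: str = "mlx") -> List:
    """Keep model repos whose repo_id contains `pattern`, largest first."""
    selected = [
        r for r in repos
        if r.repo_type == "model" and pattern.lower() in r.repo_id.lower()
    ]
    return sorted(selected, key=lambda r: r.size_on_disk, reverse=True)


def scan_mlx_models(pattern: str = "mlx") -> List:
    """Scan the local Hugging Face cache and return the matching model repos."""
    # Imported lazily so the helpers above and below work without the package.
    from huggingface_hub import scan_cache_dir

    cache_info = scan_cache_dir()  # cache_info.repos holds one entry per cached repo
    return select_mlx_repos(cache_info.repos, pattern)


def format_report(repos: Iterable) -> str:
    """Render a small table similar to the `huggingface-cli scan-cache` output."""
    lines = [f"{'REPO ID':<50} {'SIZE':>10} {'FILES':>6}"]
    for r in repos:
        lines.append(f"{r.repo_id:<50} {r.size_on_disk_str:>10} {r.nb_files:>6}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_report(scan_mlx_models()))
```

Keeping the filter separate from the scan makes it easy to test without touching a real cache, and a server flag could call the same helper.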

Ivan Fioravanti · Answer 2 · Sun Apr 28 2024 22:01:30 GMT+0800 (China Standard Time)

Is it OK with you to add a dependency on huggingface_hub?

Awni Hannun · Answer 3 · Sun Apr 28 2024 22:04:05 GMT+0800 (China Standard Time)

Yes, we already use it to download models: https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/utils.py#L16

Ivan Fioravanti · Answer 4 · Sun Apr 28 2024 22:08:04 GMT+0800 (China Standard Time)

Great! I will take this one. I was building a pure-Python implementation for this; using huggingface_hub should be simpler.

Ivan Fioravanti · Answer 5 · Mon Apr 29 2024 00:57:23 GMT+0800 (China Standard Time)

I added mlx_lm.model; maybe in the future, download functionality can be managed through it.