Add a --scan-models option to mlx_lm.server to check downloaded models
ivanfioravanti opened this issue · comments
With the Hugging Face CLI, we can use the following to list the MLX models downloaded locally, along with some stats about them:
pip install huggingface_hub[cli]
huggingface-cli scan-cache | grep -E '^REPO|mlx'
We could:
- include a Python-only way to get the list of models and some info about them
- leverage the huggingface_hub package to do that
- simply update the docs to mention how to use the HF CLI to get this info and clean up old models if needed (I think this is good enough)
I'm open to having this in MLX LM / MLX Server if it's useful for people. We can use the huggingface_hub python API. I think the relevant one is this: https://huggingface.co/docs/huggingface_hub/v0.22.2/en/guides/manage-cache#scan-cache-from-python
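For reference, here is a minimal sketch of what the Python route could look like, assuming the `scan_cache_dir` function from `huggingface_hub` described in the guide linked above. The helper names `filter_mlx_repos` and `scan_mlx_models` and the `keyword` parameter are hypothetical, made up for illustration; they are not part of mlx_lm or huggingface_hub.

```python
def filter_mlx_repos(repos, keyword="mlx"):
    """Return (repo_id, size_in_bytes) pairs for cached model repos.

    `repos` is any iterable of objects exposing `repo_id`, `repo_type`
    and `size_on_disk`, as the cached-repo entries from huggingface_hub do.
    Hypothetical helper: the name and the keyword filter are assumptions,
    mirroring the `grep mlx` in the CLI one-liner.
    """
    rows = [
        (repo.repo_id, repo.size_on_disk)
        for repo in repos
        if repo.repo_type == "model" and keyword in repo.repo_id.lower()
    ]
    # Largest repos first, so the best cleanup candidates show up on top.
    return sorted(rows, key=lambda row: row[1], reverse=True)


def scan_mlx_models(keyword="mlx"):
    """Scan the local Hugging Face cache and keep only matching model repos."""
    # Imported lazily so the filter above works without huggingface_hub installed.
    from huggingface_hub import scan_cache_dir

    return filter_mlx_repos(scan_cache_dir().repos, keyword=keyword)
```

Something like `for repo_id, size in scan_mlx_models(): print(repo_id, f"{size / 1e9:.2f} GB")` would then print roughly the same information as the CLI one-liner. Matching on the repo id is only a heuristic, just like the grep.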
Is it OK with you to add a dependency on huggingface_hub?
Yes, we already use it to download models: https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/utils.py#L16
Great! I will take this one. I was building a pure Python implementation to do this, but this should be simpler.
I added mlx_lm.model; maybe in the future download functionality can be managed through it.