PromtEngineer / localGPT

Chat with your documents on your local device using GPT models. No data leaves your device, and it is 100% private.

is it possible to use a previously downloaded HF .gguf file

cleesmith opened this issue · comments

First, this app works great on a MacBook Pro M3 Max 128GB, and with lots of transformer and LLM models. It is one of the few RAG apps I have been able to run without the internet (well, once all of the models are downloaded), and using the terminal command "sudo pumas run" I can see it using 100% GPU (mps) during queries.
So thank you so much, and for your videos on YouTube.

Since I seem to try new RAG or fine-tuning apps so often, I have a lot of GGUF files previously downloaded from Hugging Face. Is there a way to change this app to use any of those existing ".gguf" downloads? It is time-consuming to download the same files over and over again. I did notice that the "models" folder has file types other than just a ".gguf" file ... is there a way to convert a previously downloaded gguf into the layout used in your "models" folder?

Please let me know and thanks again for this repo.

Thank you, and glad you are finding this useful. I am not sure; in the snapshots folder under every model that is downloaded, there is the main gguf file. The code is using llama-cpp-python (the Python binding) to download the file. This might be doing the conversion under the hood. Will need to look into that.
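For anyone curious, here is a minimal sketch of what that download step looks like, assuming it ultimately goes through huggingface_hub's hf_hub_download (the function that produces the models--org--repo/snapshots/<commit> layout seen in the models folder); the repo and filename below are only examples:

```python
from huggingface_hub import hf_hub_download

# Downloads the file into cache_dir using the HF cache layout:
# models--TheBloke--Phind-CodeLlama-34B-v2-GGUF/snapshots/<commit>/<file>
model_path = hf_hub_download(
    repo_id="TheBloke/Phind-CodeLlama-34B-v2-GGUF",
    filename="phind-codellama-34b-v2.Q6_K.gguf",
    cache_dir="./models",
)
print(model_path)  # absolute path to the cached .gguf
```

As far as I can tell, the .gguf itself is stored unmodified; the extra files in the models folder (blobs, refs, lock files) are cache bookkeeping, not a converted model.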

@cleesmith you may look at https://huggingface.co/docs/huggingface_hub/guides/manage-cache and try setting the HF_HOME environment variable. I did this personally and all my HF models download there now. On Windows it will keep warning you about symlinks and some other things, but you can still try it. You can also install huggingface-cli and manage downloads and the cache from it.
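A quick sketch of that approach in Python (the cache path here is an example; point it at wherever your existing downloads live):

```python
import os

# Point the Hugging Face cache at your existing download location
# *before* importing any HF libraries, since they read HF_HOME at import time.
os.environ["HF_HOME"] = "/data/hf-cache"

from huggingface_hub import scan_cache_dir

# List what is already cached there (same info as `huggingface-cli scan-cache`).
for repo in scan_cache_dir().repos:
    print(repo.repo_id, repo.size_on_disk_str)
```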

@VerdonTrigance There is a PR that makes symlinks work without any errors or bugs being shown. You can look into it; the title of the PR has "symlink" in it. Thank you.

I'm not sure if there is a better way, but the only PR with "symlink" in the name I found was about ingesting documents, not about reusing previously downloaded models. Here's how to do it, for anyone still looking:

Example with TheBloke/Phind-CodeLlama-34B-v2-GGUF phind-codellama-34b-v2.Q6_K.gguf:

  • set MODEL_PATH in constants.py
  • get the latest commit hash from Hugging Face (here it is da37c48be3b0c6cd487fe05259521dc2824f5a5f)
  • mkdir --parents $MODEL_PATH/models--TheBloke--Phind-CodeLlama-34B-v2-GGUF/snapshots/da37c48be3b0c6cd487fe05259521dc2824f5a5f
  • put (or symlink) your gguf file there (a Python equivalent is sketched below)
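
Putting those steps together, here is a hedged Python equivalent (the repo, commit hash, and MODEL_PATH come from the example above; the source location of your existing gguf is hypothetical):

```python
from pathlib import Path

# Must match MODEL_PATH in constants.py.
MODEL_PATH = Path("./models")
commit = "da37c48be3b0c6cd487fe05259521dc2824f5a5f"

# Recreate the HF cache layout: models--<org>--<repo>/snapshots/<commit>/
snapshot = (
    MODEL_PATH
    / "models--TheBloke--Phind-CodeLlama-34B-v2-GGUF"
    / "snapshots"
    / commit
)
snapshot.mkdir(parents=True, exist_ok=True)

# Symlink the previously downloaded file into the snapshot folder instead of
# re-downloading it (copying the file works too).
existing = Path("~/Downloads/phind-codellama-34b-v2.Q6_K.gguf").expanduser()
(snapshot / existing.name).symlink_to(existing.resolve())
```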