ollama / ollama

Get up and running with Llama 3.1, Mistral, Gemma 2, and other large language models.

Home Page: https://ollama.com

Model Cold Storage and user manual management possibility

nikhil-swamix opened this issue

model management

It's nearly impossible to manage models manually, because the files are stored under generated hash values.

What I was trying to do was move some models to cold storage (i.e., an HDD) and keep others on the SSD, but I couldn't find a way to do it other than moving the full repository, and I'm faced with this:

image

This kind of management takes a lot of time because of the sheer volume of data to transfer. And why does one model generate 100s of blobs? Can't each model be stored in its own folder rather than littering blobs everywhere? My best bet is to check the date-modified time and move files based on that.

proposed

ollama archive <model_name> <Disk_or_path>

ollama pull <Disk_or_path> would show which models can be revived to the cache.

urgent request

observations

Some users have reported that ollama pull takes a long time (#2850, #5361, etc.). I suspect it comes from how SSDs avoid creating a huge reserved space: for the first few seconds Task Manager showed 1GB/s, then it fell to 200, then to a pathetic 5MB/s. It could be a write-protection mechanism. Maybe use chunked download and merge? Or a Hugging Face-style loader with _part001, _part002 files for loading layers?

@bmizerany @drnic @anaisbetts @sqs @lstep

@nikhil-swamix Please do not CC random people in GitHub issues. I am not a maintainer on this project.

commented

A model shouldn't generate 100s of blobs. Do the files have a suffix? That might indicate a failed download; re-pulling the model might help.
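One way to check for suffixed files is sketched below in Python. It assumes that completed blobs are named after their digest only (sha256- followed by 64 hex characters), so anything else in the blob directory is reported as a possible leftover. The function name suffixed_blobs and the naming pattern are assumptions for illustration, not something confirmed in this thread.

```python
import re
from pathlib import Path

# Assumed naming of a complete blob: "sha256-" (or "sha256:") plus 64 hex digits.
DIGEST_NAME = re.compile(r"^sha256[-:][0-9a-f]{64}$")


def suffixed_blobs(blob_dir):
    """Return the sorted names of files that do not look like a plain digest."""
    return sorted(
        p.name
        for p in Path(blob_dir).iterdir()
        if p.is_file() and not DIGEST_NAME.match(p.name)
    )
```

Calling suffixed_blobs on the blob directory returns a list of file names; an empty list would suggest there are no leftovers from interrupted downloads.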

All of the blobs that make up a model are listed in the model manifest, so a simple shell script would be able to use that information to manage the blobs (e.g., copy them to a new location and symlink back to the blob store).
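The manifest-driven approach could look roughly like the Python sketch below (Python rather than shell, purely for illustration). It rests on several assumptions not stated in this thread: that the manifest is a JSON file with a "config" entry and a "layers" list, each carrying a "digest" such as sha256:<hex>, and that each blob is stored on disk under that digest with the colon replaced by a dash. The names manifest_digests and archive_model are hypothetical, and it moves files rather than copying them.

```python
import json
import os
import shutil
from pathlib import Path


def manifest_digests(manifest_path):
    """Return every blob digest referenced by a manifest (config plus layers)."""
    manifest = json.loads(Path(manifest_path).read_text())
    entries = [manifest.get("config") or {}] + list(manifest.get("layers") or [])
    return [entry["digest"] for entry in entries if "digest" in entry]


def archive_model(manifest_path, blob_dir, cold_dir):
    """Move a model's blobs to cold_dir and leave symlinks behind in blob_dir."""
    blob_dir, cold_dir = Path(blob_dir), Path(cold_dir)
    cold_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for digest in manifest_digests(manifest_path):
        name = digest.replace(":", "-")  # assumed on-disk file name
        src = blob_dir / name
        if src.is_symlink() or not src.exists():
            continue  # already archived, or the blob is missing
        dst = cold_dir / name
        shutil.move(str(src), str(dst))
        os.symlink(dst.resolve(), src)  # blob store now points at cold storage
        moved.append(name)
    return moved
```

Because a blob can be referenced by more than one manifest, archiving one model may also relocate layers shared with another; the symlinks keep both usable as long as the cold-storage disk is mounted. Creating symlinks on Windows may require elevated privileges or developer mode.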

I have noticed the transfer slowdown you mention. It looks to me like the traffic is being rate limited at the source: I find that if I stop and restart the pull, the download rate saturates my link.