Is there anybody who successfully imported llama-3-8b-web?
Bill-XU opened this issue
I followed the instructions described here: https://github.com/ollama/ollama/blob/main/docs/import.md.
I converted the model using the options "--ctx 8192 --outtype f16 --vocab-type bpe" and quantized the result with the option "q4_0". Both steps finished successfully.
But when I use ollama to run the result, I get "Error: llama runner process no longer running: -1".
Has anybody successfully imported and run it?
Best regards, Bill
Hi @Bill-XU. Sorry you hit an error. May I ask which model and/or model architecture you tried converting? Logs with a more specific error should be available here
Hi @jmorganca
Here are the details.
=== My server spec
OS: Ubuntu 22.04 LTS
Hardware: 2 CPUs / 8 GB memory (no GPU)
=== Ollama usage
Ollama service installed
(+ Open WebUI on Docker)
=== Steps of importing llama-3-8b-web
1. Downloaded the WebLlama model assets from huggingface.co (GitHub: https://github.com/McGill-NLP/webllama)
2. Followed the "Importing (PyTorch & Safetensors)" instructions
3. For "Convert the model", ran "python llm/llama.cpp/convert.py /var/lib/custom_models/llama-3-8b-web --outtype f16 --outfile llama-3-8b-web.bin --ctx 8192 --vocab-type bpe"
4. Quantized the model with "llm/llama.cpp/quantize llama-3-8b-web.bin llama-3-8b-web-quantized.bin q4_0"
5. Steps 3 and 4 both succeeded
6. Ran "ollama create llama-3-8b-web -f llama-3-8b-web.modelfile" to create the model; no error
7. Running "ollama run llama-3-8b-web" fails with "Error: llama runner process no longer running: -1"
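A rough way to sanity-check the files produced in steps 3 and 4 before running "ollama create" is to look at their headers. This is only a sketch. It assumes that convert.py and quantize wrote GGUF output despite the .bin extension, and it relies on a GGUF file starting with the 4-byte magic "GGUF" followed by a little-endian 32-bit version number. The function name read_gguf_header is made up for illustration. A passing check only rules out a grossly malformed file; it says nothing about the tensors or the tokenizer.

```python
import struct
import sys

# A GGUF file begins with these four ASCII bytes.
GGUF_MAGIC = b"GGUF"


def read_gguf_header(path):
    """Return (magic_ok, version) read from the first 8 bytes of a file.

    version is None when the file is too short or the magic does not match.
    """
    with open(path, "rb") as f:
        head = f.read(8)
    if len(head) < 8 or head[:4] != GGUF_MAGIC:
        return False, None
    # The version is an unsigned 32-bit little-endian integer after the magic.
    (version,) = struct.unpack("<I", head[4:8])
    return True, version


if __name__ == "__main__":
    # Usage: python check_gguf.py llama-3-8b-web.bin llama-3-8b-web-quantized.bin
    for path in sys.argv[1:]:
        ok, version = read_gguf_header(path)
        if ok:
            print(f"{path}: GGUF header found, version {version}")
        else:
            print(f"{path}: no valid GGUF header")
```

If both files report a GGUF header, the crash is more likely to come from loading or running the model than from a truncated conversion, and the server log is the next place to look.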
=== Logs
ollama.log
Could the result of the conversion be incorrect or broken?
Best regards,
Bill