"llama runner process has terminated: exit status 2" on Ryzen 5600/Arc A770
berendvosmer opened this issue · comments
Describe the bug
When following the steps at ipex-llm/docs/mddocs/Quickstart/ollama_portable_zip_quickstart.md, I get "Error: llama runner process has terminated: exit status 2" when running the model in step 3.
How to reproduce
Steps to reproduce the error:
Follow the steps at ipex-llm/docs/mddocs/Quickstart/ollama_portable_zip_quickstart.md. At step 3 I get:
ggml_sycl_init: found 1 SYCL devices:
pulling manifest
pulling 96c415656d37... 100% ▕███████████████████████████████████████████████████████████▏ 4.7 GB
pulling 369ca498f347... 100% ▕███████████████████████████████████████████████████████████▏ 387 B
pulling 6e4c38e1172f... 100% ▕███████████████████████████████████████████████████████████▏ 1.1 KB
pulling f4d24e9138dd... 100% ▕███████████████████████████████████████████████████████████▏ 148 B
pulling 40fb844194b2... 100% ▕███████████████████████████████████████████████████████████▏ 487 B
verifying sha256 digest
writing manifest
success
Error: llama runner process has terminated: exit status 2
Screenshots
Output from back-end:
time=2025-03-31T13:44:51.807+02:00 level=INFO source=images.go:757 msg="total blobs: 0"
time=2025-03-31T13:44:51.807+02:00 level=INFO source=images.go:764 msg="total unused blobs removed: 0"
[GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
- using env: export GIN_MODE=release
- using code: gin.SetMode(gin.ReleaseMode)
[GIN-debug] POST /api/pull --> ollama/server.(*Server).PullHandler-fm (5 handlers)
[GIN-debug] POST /api/generate --> ollama/server.(*Server).GenerateHandler-fm (5 handlers)
[GIN-debug] POST /api/chat --> ollama/server.(*Server).ChatHandler-fm (5 handlers)
[GIN-debug] POST /api/embed --> ollama/server.(*Server).EmbedHandler-fm (5 handlers)
[GIN-debug] POST /api/embeddings --> ollama/server.(*Server).EmbeddingsHandler-fm (5 handlers)
[GIN-debug] POST /api/create --> ollama/server.(*Server).CreateHandler-fm (5 handlers)
[GIN-debug] POST /api/push --> ollama/server.(*Server).PushHandler-fm (5 handlers)
[GIN-debug] POST /api/copy --> ollama/server.(*Server).CopyHandler-fm (5 handlers)
[GIN-debug] DELETE /api/delete --> ollama/server.(*Server).DeleteHandler-fm (5 handlers)
[GIN-debug] POST /api/show --> ollama/server.(*Server).ShowHandler-fm (5 handlers)
[GIN-debug] POST /api/blobs/:digest --> ollama/server.(*Server).CreateBlobHandler-fm (5 handlers)
[GIN-debug] HEAD /api/blobs/:digest --> ollama/server.(*Server).HeadBlobHandler-fm (5 handlers)
[GIN-debug] GET /api/ps --> ollama/server.(*Server).PsHandler-fm (5 handlers)
[GIN-debug] POST /v1/chat/completions --> ollama/server.(*Server).ChatHandler-fm (6 handlers)
[GIN-debug] POST /v1/completions --> ollama/server.(*Server).GenerateHandler-fm (6 handlers)
[GIN-debug] POST /v1/embeddings --> ollama/server.(*Server).EmbedHandler-fm (6 handlers)
[GIN-debug] GET /v1/models --> ollama/server.(*Server).ListHandler-fm (6 handlers)
[GIN-debug] GET /v1/models/:model --> ollama/server.(*Server).ShowHandler-fm (6 handlers)
[GIN-debug] GET / --> ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
[GIN-debug] GET /api/tags --> ollama/server.(*Server).ListHandler-fm (5 handlers)
[GIN-debug] GET /api/version --> ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
[GIN-debug] HEAD / --> ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
[GIN-debug] HEAD /api/tags --> ollama/server.(*Server).ListHandler-fm (5 handlers)
[GIN-debug] HEAD /api/version --> ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
time=2025-03-31T13:44:51.807+02:00 level=INFO source=routes.go:1310 msg="Listening on 127.0.0.1:11434 (version 0.5.4-ipexllm-20250318)"
time=2025-03-31T13:44:51.807+02:00 level=INFO source=routes.go:1339 msg="Dynamic LLM libraries" runners=[ipex_llm]
Then on running './ollama run deepseek-r1:7b' I get:
[GIN] 2025/03/31 - 13:46:18 | 200 | 42.481µs | 127.0.0.1 | HEAD "/"
[GIN] 2025/03/31 - 13:46:18 | 404 | 211.595µs | 127.0.0.1 | POST "/api/show"
[GIN] 2025/03/31 - 13:46:19 | 200 | 396.736393ms | 127.0.0.1 | POST "/api/pull"
[GIN] 2025/03/31 - 13:46:47 | 200 | 20.87µs | 127.0.0.1 | HEAD "/"
[GIN] 2025/03/31 - 13:46:47 | 404 | 95.762µs | 127.0.0.1 | POST "/api/show"
[GIN] 2025/03/31 - 13:46:47 | 200 | 635.81324ms | 127.0.0.1 | POST "/api/pull"
[GIN] 2025/03/31 - 13:46:56 | 200 | 20.93µs | 127.0.0.1 | HEAD "/"
[GIN] 2025/03/31 - 13:46:56 | 404 | 87.742µs | 127.0.0.1 | POST "/api/show"
time=2025-03-31T13:46:57.565+02:00 level=INFO source=download.go:175 msg="downloading 96c415656d37 in 16 292 MB part(s)"
time=2025-03-31T13:47:39.903+02:00 level=INFO source=download.go:175 msg="downloading 369ca498f347 in 1 387 B part(s)"
time=2025-03-31T13:47:41.288+02:00 level=INFO source=download.go:175 msg="downloading 6e4c38e1172f in 1 1.1 KB part(s)"
time=2025-03-31T13:47:42.629+02:00 level=INFO source=download.go:175 msg="downloading f4d24e9138dd in 1 148 B part(s)"
time=2025-03-31T13:47:43.959+02:00 level=INFO source=download.go:175 msg="downloading 40fb844194b2 in 1 487 B part(s)"
[GIN] 2025/03/31 - 13:47:47 | 200 | 50.813429899s | 127.0.0.1 | POST "/api/pull"
[GIN] 2025/03/31 - 13:47:47 | 200 | 13.468933ms | 127.0.0.1 | POST "/api/show"
time=2025-03-31T13:47:47.620+02:00 level=INFO source=gpu.go:226 msg="looking for compatible GPUs"
time=2025-03-31T13:47:47.649+02:00 level=INFO source=server.go:104 msg="system memory" total="31.3 GiB" free="28.8 GiB" free_swap="8.0 GiB"
time=2025-03-31T13:47:47.649+02:00 level=INFO source=memory.go:356 msg="offload to device" layers.requested=-1 layers.model=29 layers.offload=0 layers.split="" memory.available="[28.9 GiB]" memory.gpu_overhead="0 B" memory.required.full="4.6 GiB" memory.required.partial="0 B" memory.required.kv="112.0 MiB" memory.required.allocations="[4.6 GiB]" memory.weights.total="3.8 GiB" memory.weights.repeating="3.3 GiB" memory.weights.nonrepeating="426.4 MiB" memory.graph.full="304.0 MiB" memory.graph.partial="730.4 MiB"
time=2025-03-31T13:47:47.650+02:00 level=INFO source=server.go:392 msg="starting llama server" cmd="/home/berend/ipex-llm/ollama-ipex-llm-2.2.0b20250318-ubuntu/ollama-bin runner --model /home/berend/.ollama/models/blobs/sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49 --ctx-size 2048 --batch-size 512 --n-gpu-layers 999 --threads 6 --no-mmap --parallel 1 --port 38205"
time=2025-03-31T13:47:47.650+02:00 level=INFO source=sched.go:449 msg="loaded runners" count=1
time=2025-03-31T13:47:47.650+02:00 level=INFO source=server.go:571 msg="waiting for llama runner to start responding"
time=2025-03-31T13:47:47.650+02:00 level=INFO source=server.go:605 msg="waiting for server to become available" status="llm server error"
ggml_sycl_init: found 1 SYCL devices:
time=2025-03-31T13:47:47.706+02:00 level=INFO source=runner.go:967 msg="starting go runner"
time=2025-03-31T13:47:47.707+02:00 level=INFO source=runner.go:968 msg=system info="CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | LLAMAFILE = 1 | OPENMP = 1 | AARCH64_REPACK = 1 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | LLAMAFILE = 1 | OPENMP = 1 | AARCH64_REPACK = 1 | cgo(gcc)" threads=6
time=2025-03-31T13:47:47.707+02:00 level=INFO source=runner.go:1026 msg="Server listening on 127.0.0.1:38205"
llama_load_model_from_file: using device SYCL0 (Intel(R) Arc(TM) A770 Graphics) - 15473 MiB free
llama_model_loader: loaded meta data with 26 key-value pairs and 339 tensors from /home/berend/.ollama/models/blobs/sha256-96c415656d377afbff962f6cdb2394ab092ccbcbaab4b82525bc4ca800fe8a49 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = DeepSeek R1 Distill Qwen 7B
llama_model_loader: - kv 3: general.basename str = DeepSeek-R1-Distill-Qwen
llama_model_loader: - kv 4: general.size_label str = 7B
llama_model_loader: - kv 5: qwen2.block_count u32 = 28
llama_model_loader: - kv 6: qwen2.context_length u32 = 131072
llama_model_loader: - kv 7: qwen2.embedding_length u32 = 3584
llama_model_loader: - kv 8: qwen2.feed_forward_length u32 = 18944
llama_model_loader: - kv 9: qwen2.attention.head_count u32 = 28
llama_model_loader: - kv 10: qwen2.attention.head_count_kv u32 = 4
llama_model_loader: - kv 11: qwen2.rope.freq_base f32 = 10000.000000
llama_model_loader: - kv 12: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 13: general.file_type u32 = 15
llama_model_loader: - kv 14: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 15: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 16: tokenizer.ggml.tokens arr[str,152064] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 17: tokenizer.ggml.token_type arr[i32,152064] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
time=2025-03-31T13:47:47.901+02:00 level=INFO source=server.go:605 msg="waiting for server to become available" status="llm server loading model"
llama_model_loader: - kv 18: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 19: tokenizer.ggml.bos_token_id u32 = 151646
llama_model_loader: - kv 20: tokenizer.ggml.eos_token_id u32 = 151643
llama_model_loader: - kv 21: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 22: tokenizer.ggml.add_bos_token bool = true
llama_model_loader: - kv 23: tokenizer.ggml.add_eos_token bool = false
llama_model_loader: - kv 24: tokenizer.chat_template str = {% if not add_generation_prompt is de...
llama_model_loader: - kv 25: general.quantization_version u32 = 2
llama_model_loader: - type f32: 141 tensors
llama_model_loader: - type q4_K: 169 tensors
llama_model_loader: - type q6_K: 29 tensors
llm_load_vocab: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect
llm_load_vocab: special tokens cache size = 22
llm_load_vocab: token to piece cache size = 0.9310 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = qwen2
llm_load_print_meta: vocab type = BPE
llm_load_print_meta: n_vocab = 152064
llm_load_print_meta: n_merges = 151387
llm_load_print_meta: vocab_only = 0
llm_load_print_meta: n_ctx_train = 131072
llm_load_print_meta: n_embd = 3584
llm_load_print_meta: n_layer = 28
llm_load_print_meta: n_head = 28
llm_load_print_meta: n_head_kv = 4
llm_load_print_meta: n_rot = 128
llm_load_print_meta: n_swa = 0
llm_load_print_meta: n_embd_head_k = 128
llm_load_print_meta: n_embd_head_v = 128
llm_load_print_meta: n_gqa = 7
llm_load_print_meta: n_embd_k_gqa = 512
llm_load_print_meta: n_embd_v_gqa = 512
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-06
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: f_attn_scale = 0.0e+00
llm_load_print_meta: n_ff = 18944
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 2
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 10000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 131072
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: ssm_dt_b_c_rms = 0
llm_load_print_meta: model type = 7B
llm_load_print_meta: model ftype = Q4_K - Medium
llm_load_print_meta: model params = 7.62 B
llm_load_print_meta: model size = 4.36 GiB (4.91 BPW)
llm_load_print_meta: general.name = DeepSeek R1 Distill Qwen 7B
llm_load_print_meta: BOS token = 151646 '<|begin▁of▁sentence|>'
llm_load_print_meta: EOS token = 151643 '<|end▁of▁sentence|>'
llm_load_print_meta: EOT token = 151643 '<|end▁of▁sentence|>'
llm_load_print_meta: PAD token = 151643 '<|end▁of▁sentence|>'
llm_load_print_meta: LF token = 148848 'ÄĬ'
llm_load_print_meta: FIM PRE token = 151659 '<|fim_prefix|>'
llm_load_print_meta: FIM SUF token = 151661 '<|fim_suffix|>'
llm_load_print_meta: FIM MID token = 151660 '<|fim_middle|>'
llm_load_print_meta: FIM PAD token = 151662 '<|fim_pad|>'
llm_load_print_meta: FIM REP token = 151663 '<|repo_name|>'
llm_load_print_meta: FIM SEP token = 151664 '<|file_sep|>'
llm_load_print_meta: EOG token = 151643 '<|end▁of▁sentence|>'
llm_load_print_meta: EOG token = 151662 '<|fim_pad|>'
llm_load_print_meta: EOG token = 151663 '<|repo_name|>'
llm_load_print_meta: EOG token = 151664 '<|file_sep|>'
llm_load_print_meta: max token length = 256
llm_load_tensors: offloading 28 repeating layers to GPU
llm_load_tensors: offloading output layer to GPU
llm_load_tensors: offloaded 29/29 layers to GPU
llm_load_tensors: SYCL0 model buffer size = 4168.09 MiB
llm_load_tensors: CPU model buffer size = 292.36 MiB
SIGBUS: bus error
PC=0x7beae8588d07 m=4 sigcode=2 addr=0x7be93ed13000
signal arrived during cgo execution
goroutine 11 gp=0xc0005048c0 m=4 mp=0xc000083508 [syscall]:
runtime.cgocall(0x650f1ac5a400, 0xc000091b68)
runtime/cgocall.go:167 +0x4b fp=0xc000091b40 sp=0xc000091b08 pc=0x650f1a0b7feb
ollama/llama/llamafile._Cfunc_llama_load_model_from_file(0x7bea80000be0, {0x0, 0x3e7, 0x1, 0x0, 0x0, 0x0, 0x650f1ac59e10, 0xc000694010, 0x0, ...})
_cgo_gotypes.go:701 +0x50 fp=0xc000091b68 sp=0xc000091b40 pc=0x650f1a47adb0
ollama/llama/llamafile.LoadModelFromFile.func1({0x7ffc7e987f8d?, 0xc00050cd20?}, {0x0, 0x3e7, 0x1, 0x0, 0x0, 0x0, 0x650f1ac59e10, 0xc000694010, ...})
ollama/llama/llamafile/llama.go:247 +0x127 fp=0xc000091c68 sp=0xc000091b68 pc=0x650f1a47e1e7
ollama/llama/llamafile.LoadModelFromFile({0x7ffc7e987f8d, 0x69}, {0x3e7, 0x0, 0x0, 0x0, {0x0, 0x0, 0x0}, 0xc00061e640, ...})
ollama/llama/llamafile/llama.go:247 +0x2d6 fp=0xc000091db8 sp=0xc000091c68 pc=0x650f1a47ded6
ollama/llama/runner.(*Server).loadModel(0xc0004cc120, {0x3e7, 0x0, 0x0, 0x0, {0x0, 0x0, 0x0}, 0xc00061e640, 0x0}, ...)
ollama/llama/runner/runner.go:859 +0xc5 fp=0xc000091f10 sp=0xc000091db8 pc=0x650f1a48b085
ollama/llama/runner.Execute.gowrap1()
ollama/llama/runner/runner.go:1001 +0xda fp=0xc000091fe0 sp=0xc000091f10 pc=0x650f1a48cc1a
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc000091fe8 sp=0xc000091fe0 pc=0x650f1a0c6ac1
created by ollama/llama/runner.Execute in goroutine 1
ollama/llama/runner/runner.go:1001 +0xd0d
goroutine 1 gp=0xc0000061c0 m=nil [IO wait]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc000587560 sp=0xc000587540 pc=0x650f1a0be6ee
runtime.netpollblock(0x21f80?, 0x1a055506?, 0xf?)
runtime/netpoll.go:575 +0xf7 fp=0xc000587598 sp=0xc000587560 pc=0x650f1a082357
internal/poll.runtime_pollWait(0x7beae9acd680, 0x72)
runtime/netpoll.go:351 +0x85 fp=0xc0005875b8 sp=0xc000587598 pc=0x650f1a0bd9e5
internal/poll.(*pollDesc).wait(0xc000118b00?, 0x2c?, 0x0)
internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0005875e0 sp=0xc0005875b8 pc=0x650f1a145007
internal/poll.(*pollDesc).waitRead(...)
internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Accept(0xc000118b00)
internal/poll/fd_unix.go:620 +0x295 fp=0xc000587688 sp=0xc0005875e0 pc=0x650f1a14a3d5
net.(*netFD).accept(0xc000118b00)
net/fd_unix.go:172 +0x29 fp=0xc000587740 sp=0xc000587688 pc=0x650f1a1b2aa9
net.(*TCPListener).accept(0xc0006385c0)
net/tcpsock_posix.go:159 +0x1e fp=0xc000587790 sp=0xc000587740 pc=0x650f1a1c871e
net.(*TCPListener).Accept(0xc0006385c0)
net/tcpsock.go:372 +0x30 fp=0xc0005877c0 sp=0xc000587790 pc=0x650f1a1c75d0
net/http.(*onceCloseListener).Accept(0xc000020000?)
<autogenerated>:1 +0x24 fp=0xc0005877d8 sp=0xc0005877c0 pc=0x650f1a440d24
net/http.(*Server).Serve(0xc000140960, {0x650f1b208ee0, 0xc0006385c0})
net/http/server.go:3330 +0x30c fp=0xc000587908 sp=0xc0005877d8 pc=0x650f1a418cac
ollama/llama/runner.Execute({0xc000036130?, 0x0?, 0x0?})
ollama/llama/runner/runner.go:1027 +0x11a9 fp=0xc000587ca8 sp=0xc000587908 pc=0x650f1a48c7e9
ollama/cmd.NewCLI.func2(0xc00050e500?, {0x650f1ac5ed1d?, 0x4?, 0x650f1ac5ed21?})
ollama/cmd/cmd.go:1430 +0x45 fp=0xc000587cd0 sp=0xc000587ca8 pc=0x650f1ac594e5
github.com/spf13/cobra.(*Command).execute(0xc0004ca008, {0xc0001402d0, 0xf, 0xf})
github.com/spf13/cobra@v1.8.1/command.go:985 +0xaaa fp=0xc000587e58 sp=0xc000587cd0 pc=0x650f1a24be8a
github.com/spf13/cobra.(*Command).ExecuteC(0xc0001f5b08)
github.com/spf13/cobra@v1.8.1/command.go:1117 +0x3ff fp=0xc000587f30 sp=0xc000587e58 pc=0x650f1a24c75f
github.com/spf13/cobra.(*Command).Execute(...)
github.com/spf13/cobra@v1.8.1/command.go:1041
github.com/spf13/cobra.(*Command).ExecuteContext(...)
github.com/spf13/cobra@v1.8.1/command.go:1034
main.main()
ollama/main.go:12 +0x4d fp=0xc000587f50 sp=0xc000587f30 pc=0x650f1ac59b4d
runtime.main()
runtime/proc.go:272 +0x29d fp=0xc000587fe0 sp=0xc000587f50 pc=0x650f1a0899fd
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc000587fe8 sp=0xc000587fe0 pc=0x650f1a0c6ac1
goroutine 2 gp=0xc000006c40 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00007cfa8 sp=0xc00007cf88 pc=0x650f1a0be6ee
runtime.goparkunlock(...)
runtime/proc.go:430
runtime.forcegchelper()
runtime/proc.go:337 +0xb8 fp=0xc00007cfe0 sp=0xc00007cfa8 pc=0x650f1a089d38
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00007cfe8 sp=0xc00007cfe0 pc=0x650f1a0c6ac1
created by runtime.init.7 in goroutine 1
runtime/proc.go:325 +0x1a
goroutine 3 gp=0xc000007180 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00007d780 sp=0xc00007d760 pc=0x650f1a0be6ee
runtime.goparkunlock(...)
runtime/proc.go:430
runtime.bgsweep(0xc00004c080)
runtime/mgcsweep.go:317 +0xdf fp=0xc00007d7c8 sp=0xc00007d780 pc=0x650f1a0743df
runtime.gcenable.gowrap1()
runtime/mgc.go:204 +0x25 fp=0xc00007d7e0 sp=0xc00007d7c8 pc=0x650f1a068a25
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00007d7e8 sp=0xc00007d7e0 pc=0x650f1a0c6ac1
created by runtime.gcenable in goroutine 1
runtime/mgc.go:204 +0x66
goroutine 4 gp=0xc000007340 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x650f1ae04ed8?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00007df78 sp=0xc00007df58 pc=0x650f1a0be6ee
runtime.goparkunlock(...)
runtime/proc.go:430
runtime.(*scavengerState).park(0x650f1b9a2da0)
runtime/mgcscavenge.go:425 +0x49 fp=0xc00007dfa8 sp=0xc00007df78 pc=0x650f1a071da9
runtime.bgscavenge(0xc00004c080)
runtime/mgcscavenge.go:658 +0x59 fp=0xc00007dfc8 sp=0xc00007dfa8 pc=0x650f1a072339
runtime.gcenable.gowrap2()
runtime/mgc.go:205 +0x25 fp=0xc00007dfe0 sp=0xc00007dfc8 pc=0x650f1a0689c5
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00007dfe8 sp=0xc00007dfe0 pc=0x650f1a0c6ac1
created by runtime.gcenable in goroutine 1
runtime/mgc.go:205 +0xa5
goroutine 5 gp=0xc000007c00 m=nil [finalizer wait]:
runtime.gopark(0xc00007c648?, 0x650f1a05ef25?, 0xb0?, 0x1?, 0xc0000061c0?)
runtime/proc.go:424 +0xce fp=0xc00007c620 sp=0xc00007c600 pc=0x650f1a0be6ee
runtime.runfinq()
runtime/mfinal.go:193 +0x107 fp=0xc00007c7e0 sp=0xc00007c620 pc=0x650f1a067aa7
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00007c7e8 sp=0xc00007c7e0 pc=0x650f1a0c6ac1
created by runtime.createfing in goroutine 1
runtime/mfinal.go:163 +0x3d
goroutine 6 gp=0xc0001f2e00 m=nil [chan receive]:
runtime.gopark(0xc00007e760?, 0x650f1a19a125?, 0x40?, 0xe8?, 0x650f1b21c400?)
runtime/proc.go:424 +0xce fp=0xc00007e718 sp=0xc00007e6f8 pc=0x650f1a0be6ee
runtime.chanrecv(0xc00004a310, 0x0, 0x1)
runtime/chan.go:639 +0x41c fp=0xc00007e790 sp=0xc00007e718 pc=0x650f1a05811c
runtime.chanrecv1(0x0?, 0x0?)
runtime/chan.go:489 +0x12 fp=0xc00007e7b8 sp=0xc00007e790 pc=0x650f1a057cd2
runtime.unique_runtime_registerUniqueMapCleanup.func1(...)
runtime/mgc.go:1781
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
runtime/mgc.go:1784 +0x2f fp=0xc00007e7e0 sp=0xc00007e7b8 pc=0x650f1a06ba8f
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00007e7e8 sp=0xc00007e7e0 pc=0x650f1a0c6ac1
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
runtime/mgc.go:1779 +0x96
goroutine 7 gp=0xc0001f3a40 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00007ef38 sp=0xc00007ef18 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc00007efc8 sp=0xc00007ef38 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc00007efe0 sp=0xc00007efc8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00007efe8 sp=0xc00007efe0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 18 gp=0xc000104380 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc000078738 sp=0xc000078718 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc0000787c8 sp=0xc000078738 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc0000787e0 sp=0xc0000787c8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc0000787e8 sp=0xc0000787e0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 34 gp=0xc000504000 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00050a738 sp=0xc00050a718 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc00050a7c8 sp=0xc00050a738 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc00050a7e0 sp=0xc00050a7c8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00050a7e8 sp=0xc00050a7e0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 8 gp=0xc0001f3c00 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00007f738 sp=0xc00007f718 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc00007f7c8 sp=0xc00007f738 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc00007f7e0 sp=0xc00007f7c8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00007f7e8 sp=0xc00007f7e0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 19 gp=0xc000104540 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc000078f38 sp=0xc000078f18 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc000078fc8 sp=0xc000078f38 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc000078fe0 sp=0xc000078fc8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc000078fe8 sp=0xc000078fe0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 35 gp=0xc0005041c0 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00050af38 sp=0xc00050af18 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc00050afc8 sp=0xc00050af38 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc00050afe0 sp=0xc00050afc8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00050afe8 sp=0xc00050afe0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 9 gp=0xc0004ac1c0 m=nil [GC worker (idle)]:
runtime.gopark(0x4c7ab775934?, 0x3?, 0x5f?, 0x92?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00007ff38 sp=0xc00007ff18 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc00007ffc8 sp=0xc00007ff38 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc00007ffe0 sp=0xc00007ffc8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00007ffe8 sp=0xc00007ffe0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 10 gp=0xc0004ac380 m=nil [GC worker (idle)]:
runtime.gopark(0x4c7ab777fae?, 0x3?, 0x73?, 0x83?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc000506738 sp=0xc000506718 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc0005067c8 sp=0xc000506738 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc0005067e0 sp=0xc0005067c8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc0005067e8 sp=0xc0005067e0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 36 gp=0xc000504380 m=nil [GC worker (idle)]:
runtime.gopark(0x4c7ab775cb8?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00050b738 sp=0xc00050b718 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc00050b7c8 sp=0xc00050b738 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc00050b7e0 sp=0xc00050b7c8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00050b7e8 sp=0xc00050b7e0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 37 gp=0xc000504540 m=nil [GC worker (idle)]:
runtime.gopark(0x4c7ab781541?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00050bf38 sp=0xc00050bf18 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc00050bfc8 sp=0xc00050bf38 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc00050bfe0 sp=0xc00050bfc8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00050bfe8 sp=0xc00050bfe0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 38 gp=0xc000504700 m=nil [GC worker (idle)]:
runtime.gopark(0x650f1b9cc920?, 0x1?, 0x12?, 0x3e?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00050c738 sp=0xc00050c718 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc00050c7c8 sp=0xc00050c738 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc00050c7e0 sp=0xc00050c7c8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00050c7e8 sp=0xc00050c7e0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 20 gp=0xc000104700 m=nil [GC worker (idle)]:
runtime.gopark(0x4c7ab7744da?, 0x0?, 0x0?, 0x0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc000079738 sp=0xc000079718 pc=0x650f1a0be6ee
runtime.gcBgMarkWorker(0xc00004b730)
runtime/mgc.go:1412 +0xe9 fp=0xc0000797c8 sp=0xc000079738 pc=0x650f1a06ad89
runtime.gcBgMarkStartWorkers.gowrap1()
runtime/mgc.go:1328 +0x25 fp=0xc0000797e0 sp=0xc0000797c8 pc=0x650f1a06ac65
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc0000797e8 sp=0xc0000797e0 pc=0x650f1a0c6ac1
created by runtime.gcBgMarkStartWorkers in goroutine 1
runtime/mgc.go:1328 +0x105
goroutine 12 gp=0xc000504a80 m=nil [semacquire]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0xa0?, 0x0?)
runtime/proc.go:424 +0xce fp=0xc00050d618 sp=0xc00050d5f8 pc=0x650f1a0be6ee
runtime.goparkunlock(...)
runtime/proc.go:430
runtime.semacquire1(0xc0004cc128, 0x0, 0x1, 0x0, 0x12)
runtime/sema.go:178 +0x22c fp=0xc00050d680 sp=0xc00050d618 pc=0x650f1a09caac
sync.runtime_Semacquire(0x0?)
runtime/sema.go:71 +0x25 fp=0xc00050d6b8 sp=0xc00050d680 pc=0x650f1a0bff05
sync.(*WaitGroup).Wait(0x0?)
sync/waitgroup.go:118 +0x48 fp=0xc00050d6e0 sp=0xc00050d6b8 pc=0x650f1a0d52e8
ollama/llama/runner.(*Server).run(0xc0004cc120, {0x650f1b20b1d0, 0xc0006240f0})
ollama/llama/runner/runner.go:315 +0x47 fp=0xc00050d7b8 sp=0xc00050d6e0 pc=0x650f1a487707
ollama/llama/runner.Execute.gowrap2()
ollama/llama/runner/runner.go:1006 +0x28 fp=0xc00050d7e0 sp=0xc00050d7b8 pc=0x650f1a48cb08
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc00050d7e8 sp=0xc00050d7e0 pc=0x650f1a0c6ac1
created by ollama/llama/runner.Execute in goroutine 1
ollama/llama/runner/runner.go:1006 +0xde5
goroutine 50 gp=0xc000604700 m=nil [IO wait]:
runtime.gopark(0x650f1a148605?, 0xc0001aa000?, 0x10?, 0x7a?, 0xb?)
runtime/proc.go:424 +0xce fp=0xc0002d7918 sp=0xc0002d78f8 pc=0x650f1a0be6ee
runtime.netpollblock(0x650f1a0e1918?, 0x1a055506?, 0xf?)
runtime/netpoll.go:575 +0xf7 fp=0xc0002d7950 sp=0xc0002d7918 pc=0x650f1a082357
internal/poll.runtime_pollWait(0x7beae9acd568, 0x72)
runtime/netpoll.go:351 +0x85 fp=0xc0002d7970 sp=0xc0002d7950 pc=0x650f1a0bd9e5
internal/poll.(*pollDesc).wait(0xc0001aa000?, 0xc0001fc000?, 0x0)
internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0002d7998 sp=0xc0002d7970 pc=0x650f1a145007
internal/poll.(*pollDesc).waitRead(...)
internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc0001aa000, {0xc0001fc000, 0x1000, 0x1000})
internal/poll/fd_unix.go:165 +0x27a fp=0xc0002d7a30 sp=0xc0002d7998 pc=0x650f1a1462fa
net.(*netFD).Read(0xc0001aa000, {0xc0001fc000?, 0xc0002d7aa0?, 0x650f1a1454c5?})
net/fd_posix.go:55 +0x25 fp=0xc0002d7a78 sp=0xc0002d7a30 pc=0x650f1a1b0ae5
net.(*conn).Read(0xc000120088, {0xc0001fc000?, 0x0?, 0xc0003220c8?})
net/net.go:189 +0x45 fp=0xc0002d7ac0 sp=0xc0002d7a78 pc=0x650f1a1bf0e5
net.(*TCPConn).Read(0xc0003220c0?, {0xc0001fc000?, 0xc0001aa000?, 0xc0002d7af8?})
<autogenerated>:1 +0x25 fp=0xc0002d7af0 sp=0xc0002d7ac0 pc=0x650f1a1d22e5
net/http.(*connReader).Read(0xc0003220c0, {0xc0001fc000, 0x1000, 0x1000})
net/http/server.go:798 +0x14b fp=0xc0002d7b40 sp=0xc0002d7af0 pc=0x650f1a40ea6b
bufio.(*Reader).fill(0xc000112060)
bufio/bufio.go:110 +0x103 fp=0xc0002d7b78 sp=0xc0002d7b40 pc=0x650f1a1d69e3
bufio.(*Reader).Peek(0xc000112060, 0x4)
bufio/bufio.go:148 +0x53 fp=0xc0002d7b98 sp=0xc0002d7b78 pc=0x650f1a1d6b13
net/http.(*conn).serve(0xc000020000, {0x650f1b20b198, 0xc0001c6420})
net/http/server.go:2127 +0x738 fp=0xc0002d7fb8 sp=0xc0002d7b98 pc=0x650f1a413db8
net/http.(*Server).Serve.gowrap3()
net/http/server.go:3360 +0x28 fp=0xc0002d7fe0 sp=0xc0002d7fb8 pc=0x650f1a4190a8
runtime.goexit({})
runtime/asm_amd64.s:1700 +0x1 fp=0xc0002d7fe8 sp=0xc0002d7fe0 pc=0x650f1a0c6ac1
created by net/http.(*Server).Serve in goroutine 1
net/http/server.go:3360 +0x485
rax 0x7be93ed13000
rbx 0x7bea8287b390
rcx 0x3800
rdx 0x3800
rdi 0x7be93ed13000
rsi 0x7bea82877b80
rbp 0x7bea8affb580
rsp 0x7bea8affb398
r8 0x7be93ed13000
r9 0x13330b000
r10 0x1
r11 0x246
r12 0x7bea82877b80
r13 0x7be93ed13000
r14 0x8287e400
r15 0x7bea80b25c60
rip 0x7beae8588d07
rflags 0x10206
cs 0x33
fs 0x0
gs 0x0
time=2025-03-31T13:47:48.158+02:00 level=INFO source=server.go:605 msg="waiting for server to become available" status="llm server error"
time=2025-03-31T13:47:48.409+02:00 level=ERROR source=sched.go:455 msg="error loading llama server" error="llama runner process has terminated: exit status 2"
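For anyone triaging a similar dump, the few lines that matter can be pulled out of the noise with a small script. This is only a sketch: `summarize_crash` is a hypothetical helper, not part of ollama or ipex-llm, and the regular expressions assume the log wording shown above. The `sigcode` mapping uses the Linux `si_code` values for SIGBUS as I understand them.

```python
import re

# Linux si_code values for SIGBUS (assumption: values from asm-generic/siginfo.h).
SIGBUS_CODES = {
    1: "BUS_ADRALN (invalid address alignment)",
    2: "BUS_ADRERR (nonexistent physical address)",
    3: "BUS_OBJERR (object-specific hardware error)",
}

def summarize_crash(log_text):
    """Extract the offload count, fatal signal and sigcode from a runner log."""
    summary = {}
    m = re.search(r"offloaded (\d+)/(\d+) layers to GPU", log_text)
    if m:
        summary["offloaded"] = (int(m.group(1)), int(m.group(2)))
    m = re.search(r"^(SIG[A-Z]+): (.+)$", log_text, re.MULTILINE)
    if m:
        summary["signal"] = m.group(1)
    m = re.search(r"sigcode=(\d+) addr=(0x[0-9a-f]+)", log_text)
    if m:
        summary["sigcode"] = int(m.group(1))
        summary["addr"] = m.group(2)
        if summary.get("signal") == "SIGBUS":
            summary["meaning"] = SIGBUS_CODES.get(summary["sigcode"], "unknown")
    return summary

# Excerpt copied from the log above.
sample = """llm_load_tensors: offloaded 29/29 layers to GPU
llm_load_tensors: SYCL0 model buffer size = 4168.09 MiB
SIGBUS: bus error
PC=0x7beae8588d07 m=4 sigcode=2 addr=0x7be93ed13000
signal arrived during cgo execution"""

print(summarize_crash(sample))
```

On the excerpt this reports that all 29 layers were offloaded before the runner died with SIGBUS and sigcode 2, i.e. the crash happened inside the C model-loading call and not in the Go server itself.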
Environment information
Output of env-check.sh:
-----------------------------------------------------------------
-----------------------------------------------------------------
Transformers is not installed.
-----------------------------------------------------------------
PyTorch is not installed.
-----------------------------------------------------------------
ipex-llm ./env-check.sh: line 41: pip: command not found
-----------------------------------------------------------------
IPEX is not installed.
-----------------------------------------------------------------
CPU Information:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 5 5600 6-Core Processor
CPU family: 25
Model: 33
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
Stepping: 2
Frequency boost: enabled
CPU(s) scaling MHz: 52%
CPU max MHz: 4468.0000
-----------------------------------------------------------------
Total CPU Memory: 31.2684 GB
-----------------------------------------------------------------
Operating System:
Ubuntu 24.04.2 LTS \n \l
-----------------------------------------------------------------
Linux adamantium 6.11.0-21-generic #21~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Mon Feb 24 16:52:15 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
-----------------------------------------------------------------
./env-check.sh: line 131: xpu-smi: command not found
-----------------------------------------------------------------
Driver UUID 32352e30-352e-3332-3536-370000000000
Driver Version 25.05.32567
-----------------------------------------------------------------
Driver related package version:
-----------------------------------------------------------------
./env-check.sh: line 150: sycl-ls: command not found
igpu not detected
-----------------------------------------------------------------
xpu-smi is not installed. Please install xpu-smi according to README.md
Additional context
I previously also ran containers with ipex-llm, ollama and open-webui and got the same result. Changing OLLAMA_NUM_GPU from 999 to 1 avoided the error, but then only 1 layer was offloaded to the GPU.
Could this be related to issue #12992?
Resizable BAR was disabled in the BIOS. Will try again and report.
Setting Resizable BAR to 'auto' in the BIOS solved this issue for me. #12992
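In case it helps others check this from Linux without rebooting into the BIOS: `lspci -vv` lists a "Physical Resizable BAR" capability with the current BAR size for devices that support it. Below is a rough sketch that parses that output. `rebar_sizes` is a hypothetical helper, and the sample text is made up to illustrate the format, not captured from my machine.

```python
import re

def rebar_sizes(lspci_text):
    """Return (bar_index, current_size) pairs found under Resizable BAR capabilities."""
    sizes = []
    in_rebar = False
    for line in lspci_text.splitlines():
        # A non-indented line starts a new PCI device, so leave any capability block.
        if line and not line[0].isspace():
            in_rebar = False
            continue
        stripped = line.strip()
        if stripped.startswith("Capabilities:"):
            in_rebar = "Resizable BAR" in stripped
        elif in_rebar:
            m = re.match(r"BAR (\d+): current size: ([^,\s]+)", stripped)
            if m:
                sizes.append((int(m.group(1)), m.group(2)))
    return sizes

# Hypothetical sample in the style of `lspci -vv` output (illustration only).
sample = """03:00.0 VGA compatible controller: Intel Corporation DG2 [Arc A770]
\tCapabilities: [420 v1] Physical Resizable BAR
\t\tBAR 2: current size: 256MB, supported: 256MB 512MB 1GB 2GB 4GB 8GB 16GB
\tCapabilities: [400 v1] Latency Tolerance Reporting
"""

print(rebar_sizes(sample))
```

To use it on a real system, save the output of `sudo lspci -vv` to a file and pass its contents to `rebar_sizes`. My assumption is that a current size far below the card's VRAM (for example 256MB on a 16 GB A770) suggests Resizable BAR is not active, but I have not verified that this is the only cause of the crash.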