intel / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, DeepSeek, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, DeepSpeed, Axolotl, etc.

Repository from GitHub: https://github.com/intel/ipex-llm

ipex-llm will run slower and slower

dttprofessor opened this issue

SYSTEM: U265K (no iGPU) + B580
QUESTION: I run "ollama run qwen:7b" to translate web pages. After translating 300-500 sentences, the translation speed gradually slows down and GPU utilization drops from 90% to below 10%. Restarting ollama serve restores the speed, but the slowdown eventually occurs again. The same may also happen with chat services: after ipex-llm has been running for a while, GPU utilization drops.
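To make the slowdown measurable, a small probe like the one below could time each request and average the timings in windows. This is only a sketch under assumptions: the endpoint path and port are taken from the server log in this thread, while the model tag, the prompt, the workload in `main()`, and all helper names are hypothetical placeholders.

```python
import json
import time
import urllib.request

# Assumed values: path and port come from the server log in this thread;
# the model tag and the prompt text are placeholders for illustration.
OLLAMA_URL = "http://127.0.0.1:11434/v1/chat/completions"
MODEL = "qwen:7b"


def post_chat(sentence, url=OLLAMA_URL, model=MODEL):
    """Send one translation request to the OpenAI-compatible endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": f"Translate: {sentence}"}],
        "stream": False,
    }
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.load(resp)


def measure_latencies(sentences, send=post_chat, clock=time.perf_counter):
    """Return the wall-clock seconds each request took, in request order."""
    latencies = []
    for sentence in sentences:
        start = clock()
        send(sentence)
        latencies.append(clock() - start)
    return latencies


def windowed_means(values, window):
    """Average consecutive, non-overlapping windows to expose a trend."""
    return [
        sum(values[i:i + window]) / len(values[i:i + window])
        for i in range(0, len(values), window)
    ]


def main():
    # Hypothetical workload: one sentence repeated 500 times, sent sequentially.
    latencies = measure_latencies(["Hello, world."] * 500)
    for n, avg in enumerate(windowed_means(latencies, 50)):
        print(f"window {n}: {avg:.2f} s per request")
```

Calling `main()` against a running server prints one average per 50 requests; a steadily rising average would match the behaviour described above. Requests are sent one at a time here, whereas a browser translation plugin may send several in parallel.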

Similar issue: #12852

Hi @dttprofessor, I can't reproduce your issue. Could you provide the ollama serve log, along with the GPU performance and driver version shown in Task Manager when the performance issue occurs?

B580 driver: 32.0.101.6559

This is a web-based translation case. After starting ollama for web translation and translating 300-500 sentences, GPU utilization decreases steadily and then stays at around 10%, unless ollama is restarted or ollama automatically unloads the model from GPU memory (every 5 minutes). When the translation service is used again after that, GPU utilization quickly rises above 90% and stays there for several minutes.
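Since unloading the model appears to restore the speed, the unload could be triggered on demand instead of restarting the server. The sketch below assumes the Ollama `/api/generate` endpoint accepts a request with `keep_alive` set to 0 and no prompt as an unload request, and that this also holds for the ipex-llm build used here, which has not been verified. The host and port are taken from the log, and the helper names are hypothetical. This would only automate the workaround and does not address the root cause.

```python
import json
import urllib.request

# Assumed host and port, taken from the "Listening on [::]:11434" log line.
GENERATE_URL = "http://127.0.0.1:11434/api/generate"


def build_unload_request(model, url=GENERATE_URL):
    """Build a POST with keep_alive=0 and no prompt, asking Ollama to unload the model."""
    body = json.dumps({"model": model, "keep_alive": 0}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def unload_model(model, opener=urllib.request.urlopen):
    """Send the unload request and return the decoded JSON reply."""
    with opener(build_unload_request(model), timeout=30) as resp:
        return json.load(resp)
```

For example, `unload_model("qwen:7b")` could be called once request latency crosses a threshold; the next translation request then reloads the model.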

ggml_sycl_init: GGML_SYCL_FORCE_MMQ: no
ggml_sycl_init: SYCL_USE_XMX: yes
ggml_sycl_init: found 1 SYCL devices:
2025/02/25 11:14:06 routes.go:1259: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY:localhost,127.0.0.1 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:10m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\Users\zhaoy\.ollama\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES:]"
time=2025-02-25T11:14:06.876+08:00 level=INFO source=images.go:757 msg="total blobs: 12"
time=2025-02-25T11:14:06.877+08:00 level=INFO source=images.go:764 msg="total unused blobs removed: 0"
[GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.

[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.

  • using env: export GIN_MODE=release
  • using code: gin.SetMode(gin.ReleaseMode)

[GIN-debug] POST /api/pull --> ollama/server.(*Server).PullHandler-fm (5 handlers)
[GIN-debug] POST /api/generate --> ollama/server.(*Server).GenerateHandler-fm (5 handlers)
[GIN-debug] POST /api/chat --> ollama/server.(*Server).ChatHandler-fm (5 handlers)
[GIN-debug] POST /api/embed --> ollama/server.(*Server).EmbedHandler-fm (5 handlers)
[GIN-debug] POST /api/embeddings --> ollama/server.(*Server).EmbeddingsHandler-fm (5 handlers)
[GIN-debug] POST /api/create --> ollama/server.(*Server).CreateHandler-fm (5 handlers)
[GIN-debug] POST /api/push --> ollama/server.(*Server).PushHandler-fm (5 handlers)
[GIN-debug] POST /api/copy --> ollama/server.(*Server).CopyHandler-fm (5 handlers)
[GIN-debug] DELETE /api/delete --> ollama/server.(*Server).DeleteHandler-fm (5 handlers)
[GIN-debug] POST /api/show --> ollama/server.(*Server).ShowHandler-fm (5 handlers)
[GIN-debug] POST /api/blobs/:digest --> ollama/server.(*Server).CreateBlobHandler-fm (5 handlers)
[GIN-debug] HEAD /api/blobs/:digest --> ollama/server.(*Server).HeadBlobHandler-fm (5 handlers)
[GIN-debug] GET /api/ps --> ollama/server.(*Server).PsHandler-fm (5 handlers)
[GIN-debug] POST /v1/chat/completions --> ollama/server.(*Server).ChatHandler-fm (6 handlers)
[GIN-debug] POST /v1/completions --> ollama/server.(*Server).GenerateHandler-fm (6 handlers)
[GIN-debug] POST /v1/embeddings --> ollama/server.(*Server).EmbedHandler-fm (6 handlers)
[GIN-debug] GET /v1/models --> ollama/server.(*Server).ListHandler-fm (6 handlers)
[GIN-debug] GET /v1/models/:model --> ollama/server.(*Server).ShowHandler-fm (6 handlers)
[GIN-debug] GET / --> ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
[GIN-debug] GET /api/tags --> ollama/server.(*Server).ListHandler-fm (5 handlers)
[GIN-debug] GET /api/version --> ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
[GIN-debug] HEAD / --> ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
[GIN-debug] HEAD /api/tags --> ollama/server.(*Server).ListHandler-fm (5 handlers)
[GIN-debug] HEAD /api/version --> ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
time=2025-02-25T11:14:06.888+08:00 level=INFO source=routes.go:1310 msg="Listening on [::]:11434 (version 0.5.4-ipexllm-20250220)"
time=2025-02-25T11:14:06.888+08:00 level=INFO source=routes.go:1339 msg="Dynamic LLM libraries" runners=[ipex_llm]
time=2025-02-25T11:14:44.164+08:00 level=INFO source=gpu.go:226 msg="looking for compatible GPUs"
time=2025-02-25T11:14:44.165+08:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2025-02-25T11:14:44.165+08:00 level=INFO source=gpu_windows.go:183 msg="efficiency cores detected" maxEfficiencyClass=1
time=2025-02-25T11:14:44.165+08:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=20 efficiency=12 threads=20
time=2025-02-25T11:14:44.183+08:00 level=INFO source=server.go:104 msg="system memory" total="47.6 GiB" free="37.2 GiB" free_swap="36.6 GiB"
time=2025-02-25T11:14:44.183+08:00 level=INFO source=memory.go:356 msg="offload to device" layers.requested=-1 layers.model=29 layers.offload=0 layers.split="" memory.available="[37.2 GiB]" memory.gpu_overhead="0 B" memory.required.full="4.6 GiB" memory.required.partial="0 B" memory.required.kv="112.0 MiB" memory.required.allocations="[4.6 GiB]" memory.weights.total="3.8 GiB" memory.weights.repeating="3.3 GiB" memory.weights.nonrepeating="426.4 MiB" memory.graph.full="304.0 MiB" memory.graph.partial="730.4 MiB"
time=2025-02-25T11:14:44.190+08:00 level=INFO source=server.go:392 msg="starting llama server" cmd="E:\ollama-0.5.4-ipex-llm-2.2.0b20250213\ollama-lib.exe runner --model C:\Users\zhaoy\.ollama\models\blobs\sha256-2bada8a7450677000f678be90653b85d364de7db25eb5ea54136ada5f3933730 --ctx-size 2048 --batch-size 512 --n-gpu-layers 999 --threads 8 --no-mmap --parallel 1 --port 50773"
time=2025-02-25T11:14:44.195+08:00 level=INFO source=sched.go:449 msg="loaded runners" count=1
time=2025-02-25T11:14:44.197+08:00 level=INFO source=server.go:571 msg="waiting for llama runner to start responding"
time=2025-02-25T11:14:44.198+08:00 level=INFO source=server.go:605 msg="waiting for server to become available" status="llm server error"
ggml_sycl_init: GGML_SYCL_FORCE_MMQ: no
ggml_sycl_init: SYCL_USE_XMX: yes
ggml_sycl_init: found 1 SYCL devices:
time=2025-02-25T11:14:44.363+08:00 level=INFO source=runner.go:967 msg="starting go runner"
time=2025-02-25T11:14:44.366+08:00 level=INFO source=runner.go:968 msg=system info="CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | LLAMAFILE = 1 | OPENMP = 1 | AARCH64_REPACK = 1 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | LLAMAFILE = 1 | OPENMP = 1 | AARCH64_REPACK = 1 | cgo(clang)" threads=8
time=2025-02-25T11:14:44.366+08:00 level=INFO source=runner.go:1026 msg="Server listening on 127.0.0.1:50773"
time=2025-02-25T11:14:44.449+08:00 level=INFO source=server.go:605 msg="waiting for server to become available" status="llm server loading model"
llama_load_model_from_file: using device SYCL0 (Intel(R) Arc(TM) B580 Graphics) - 11349 MiB free
llama_model_loader: loaded meta data with 34 key-value pairs and 339 tensors from C:\Users\zhaoy\.ollama\models\blobs\sha256-2bada8a7450677000f678be90653b85d364de7db25eb5ea54136ada5f3933730 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Qwen2.5 7B Instruct
llama_model_loader: - kv 3: general.finetune str = Instruct
llama_model_loader: - kv 4: general.basename str = Qwen2.5
llama_model_loader: - kv 5: general.size_label str = 7B
llama_model_loader: - kv 6: general.license str = apache-2.0
llama_model_loader: - kv 7: general.license.link str = https://huggingface.co/Qwen/Qwen2.5-7...
llama_model_loader: - kv 8: general.base_model.count u32 = 1
llama_model_loader: - kv 9: general.base_model.0.name str = Qwen2.5 7B
llama_model_loader: - kv 10: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 11: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen2.5-7B
llama_model_loader: - kv 12: general.tags arr[str,2] = ["chat", "text-generation"]
llama_model_loader: - kv 13: general.languages arr[str,1] = ["en"]
llama_model_loader: - kv 14: qwen2.block_count u32 = 28
llama_model_loader: - kv 15: qwen2.context_length u32 = 32768
llama_model_loader: - kv 16: qwen2.embedding_length u32 = 3584
llama_model_loader: - kv 17: qwen2.feed_forward_length u32 = 18944
llama_model_loader: - kv 18: qwen2.attention.head_count u32 = 28
llama_model_loader: - kv 19: qwen2.attention.head_count_kv u32 = 4
llama_model_loader: - kv 20: qwen2.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 21: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 22: general.file_type u32 = 15
llama_model_loader: - kv 23: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 24: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 25: tokenizer.ggml.tokens arr[str,152064] = ["!", """, "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 26: tokenizer.ggml.token_type arr[i32,152064] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 27: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 28: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 29: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 30: tokenizer.ggml.bos_token_id u32 = 151643
llama_model_loader: - kv 31: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 32: tokenizer.chat_template str = {%- if tools %}\n {{- '<|im_start|>...
llama_model_loader: - kv 33: general.quantization_version u32 = 2
llama_model_loader: - type f32: 141 tensors
llama_model_loader: - type q4_K: 169 tensors
llama_model_loader: - type q6_K: 29 tensors
llm_load_vocab: special tokens cache size = 22
llm_load_vocab: token to piece cache size = 0.9310 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = qwen2
llm_load_print_meta: vocab type = BPE
llm_load_print_meta: n_vocab = 152064
llm_load_print_meta: n_merges = 151387
llm_load_print_meta: vocab_only = 0
llm_load_print_meta: n_ctx_train = 32768
llm_load_print_meta: n_embd = 3584
llm_load_print_meta: n_layer = 28
llm_load_print_meta: n_head = 28
llm_load_print_meta: n_head_kv = 4
llm_load_print_meta: n_rot = 128
llm_load_print_meta: n_swa = 0
llm_load_print_meta: n_embd_head_k = 128
llm_load_print_meta: n_embd_head_v = 128
llm_load_print_meta: n_gqa = 7
llm_load_print_meta: n_embd_k_gqa = 512
llm_load_print_meta: n_embd_v_gqa = 512
llm_load_print_meta: f_norm_eps = 0.0e+00
llm_load_print_meta: f_norm_rms_eps = 1.0e-06
llm_load_print_meta: f_clamp_kqv = 0.0e+00
llm_load_print_meta: f_max_alibi_bias = 0.0e+00
llm_load_print_meta: f_logit_scale = 0.0e+00
llm_load_print_meta: n_ff = 18944
llm_load_print_meta: n_expert = 0
llm_load_print_meta: n_expert_used = 0
llm_load_print_meta: causal attn = 1
llm_load_print_meta: pooling type = 0
llm_load_print_meta: rope type = 2
llm_load_print_meta: rope scaling = linear
llm_load_print_meta: freq_base_train = 1000000.0
llm_load_print_meta: freq_scale_train = 1
llm_load_print_meta: n_ctx_orig_yarn = 32768
llm_load_print_meta: rope_finetuned = unknown
llm_load_print_meta: ssm_d_conv = 0
llm_load_print_meta: ssm_d_inner = 0
llm_load_print_meta: ssm_d_state = 0
llm_load_print_meta: ssm_dt_rank = 0
llm_load_print_meta: ssm_dt_b_c_rms = 0
llm_load_print_meta: model type = 7B
llm_load_print_meta: model ftype = Q4_K - Medium
llm_load_print_meta: model params = 7.62 B
llm_load_print_meta: model size = 4.36 GiB (4.91 BPW)
llm_load_print_meta: general.name = Qwen2.5 7B Instruct
llm_load_print_meta: BOS token = 151643 '<|endoftext|>'
llm_load_print_meta: EOS token = 151645 '<|im_end|>'
llm_load_print_meta: EOT token = 151645 '<|im_end|>'
llm_load_print_meta: PAD token = 151643 '<|endoftext|>'
llm_load_print_meta: LF token = 148848 'ÄĬ'
llm_load_print_meta: FIM PRE token = 151659 '<|fim_prefix|>'
llm_load_print_meta: FIM SUF token = 151661 '<|fim_suffix|>'
llm_load_print_meta: FIM MID token = 151660 '<|fim_middle|>'
llm_load_print_meta: FIM PAD token = 151662 '<|fim_pad|>'
llm_load_print_meta: FIM REP token = 151663 '<|repo_name|>'
llm_load_print_meta: FIM SEP token = 151664 '<|file_sep|>'
llm_load_print_meta: EOG token = 151643 '<|endoftext|>'
llm_load_print_meta: EOG token = 151645 '<|im_end|>'
llm_load_print_meta: EOG token = 151662 '<|fim_pad|>'
llm_load_print_meta: EOG token = 151663 '<|repo_name|>'
llm_load_print_meta: EOG token = 151664 '<|file_sep|>'
llm_load_print_meta: max token length = 256
llm_load_tensors: offloading 28 repeating layers to GPU
llm_load_tensors: offloading output layer to GPU
llm_load_tensors: offloaded 29/29 layers to GPU
llm_load_tensors: CPU model buffer size = 292.36 MiB
llm_load_tensors: SYCL0 model buffer size = 4168.09 MiB
llama_new_context_with_model: n_seq_max = 1
llama_new_context_with_model: n_ctx = 2048
llama_new_context_with_model: n_ctx_per_seq = 2048
llama_new_context_with_model: n_batch = 512
llama_new_context_with_model: n_ubatch = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base = 1000000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: n_ctx_per_seq (2048) < n_ctx_train (32768) -- the full capacity of the model will not be utilized
[SYCL] call ggml_check_sycl
ggml_check_sycl: GGML_SYCL_DEBUG: 0
ggml_check_sycl: GGML_SYCL_F16: no
Found 1 SYCL devices:
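|  |                   |                       |       |Max    |        |Max  |Global |              |
|  |                   |                       |       |compute|Max work|sub  |mem    |              |
|ID|        Device Type|                   Name|Version|units  |group   |group|size   |Driver version|
|--|-------------------|-----------------------|-------|-------|--------|-----|-------|--------------|
| 0| [level_zero:gpu:0]|Intel Arc B580 Graphics|   20.1|    160|    1024|   32| 12450M|     1.6.31896|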
llama_kv_cache_init: SYCL0 KV buffer size = 112.00 MiB
llama_new_context_with_model: KV self size = 112.00 MiB, K (f16): 56.00 MiB, V (f16): 56.00 MiB
llama_new_context_with_model: SYCL_Host output buffer size = 0.59 MiB
llama_new_context_with_model: SYCL0 compute buffer size = 304.00 MiB
llama_new_context_with_model: SYCL_Host compute buffer size = 11.01 MiB
llama_new_context_with_model: graph nodes = 874
llama_new_context_with_model: graph splits = 2
time=2025-02-25T11:14:50.445+08:00 level=WARN source=runner.go:892 msg="%s: warming up the model with an empty run - please wait ... " !BADKEY=loadModel
time=2025-02-25T11:14:50.456+08:00 level=INFO source=server.go:610 msg="llama runner started in 6.26 seconds"
llama_load_model_from_file: using device SYCL0 (Intel(R) Arc(TM) B580 Graphics) - 6677 MiB free
llama_model_loader: loaded meta data with 34 key-value pairs and 339 tensors from C:\Users\zhaoy\.ollama\models\blobs\sha256-2bada8a7450677000f678be90653b85d364de7db25eb5ea54136ada5f3933730 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Qwen2.5 7B Instruct
llama_model_loader: - kv 3: general.finetune str = Instruct
llama_model_loader: - kv 4: general.basename str = Qwen2.5
llama_model_loader: - kv 5: general.size_label str = 7B
llama_model_loader: - kv 6: general.license str = apache-2.0
llama_model_loader: - kv 7: general.license.link str = https://huggingface.co/Qwen/Qwen2.5-7...
llama_model_loader: - kv 8: general.base_model.count u32 = 1
llama_model_loader: - kv 9: general.base_model.0.name str = Qwen2.5 7B
llama_model_loader: - kv 10: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 11: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen2.5-7B
llama_model_loader: - kv 12: general.tags arr[str,2] = ["chat", "text-generation"]
llama_model_loader: - kv 13: general.languages arr[str,1] = ["en"]
llama_model_loader: - kv 14: qwen2.block_count u32 = 28
llama_model_loader: - kv 15: qwen2.context_length u32 = 32768
llama_model_loader: - kv 16: qwen2.embedding_length u32 = 3584
llama_model_loader: - kv 17: qwen2.feed_forward_length u32 = 18944
llama_model_loader: - kv 18: qwen2.attention.head_count u32 = 28
llama_model_loader: - kv 19: qwen2.attention.head_count_kv u32 = 4
llama_model_loader: - kv 20: qwen2.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 21: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 22: general.file_type u32 = 15
llama_model_loader: - kv 23: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 24: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 25: tokenizer.ggml.tokens arr[str,152064] = ["!", """, "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 26: tokenizer.ggml.token_type arr[i32,152064] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 27: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 28: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 29: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 30: tokenizer.ggml.bos_token_id u32 = 151643
llama_model_loader: - kv 31: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 32: tokenizer.chat_template str = {%- if tools %}\n {{- '<|im_start|>...
llama_model_loader: - kv 33: general.quantization_version u32 = 2
llama_model_loader: - type f32: 141 tensors
llama_model_loader: - type q4_K: 169 tensors
llama_model_loader: - type q6_K: 29 tensors
llm_load_vocab: special tokens cache size = 22
llm_load_vocab: token to piece cache size = 0.9310 MB
llm_load_print_meta: format = GGUF V3 (latest)
llm_load_print_meta: arch = qwen2
llm_load_print_meta: vocab type = BPE
llm_load_print_meta: n_vocab = 152064
llm_load_print_meta: n_merges = 151387
llm_load_print_meta: vocab_only = 1
llm_load_print_meta: model type = ?B
llm_load_print_meta: model ftype = all F32
llm_load_print_meta: model params = 7.62 B
llm_load_print_meta: model size = 4.36 GiB (4.91 BPW)
llm_load_print_meta: general.name = Qwen2.5 7B Instruct
llm_load_print_meta: BOS token = 151643 '<|endoftext|>'
llm_load_print_meta: EOS token = 151645 '<|im_end|>'
llm_load_print_meta: EOT token = 151645 '<|im_end|>'
llm_load_print_meta: PAD token = 151643 '<|endoftext|>'
llm_load_print_meta: LF token = 148848 'ÄĬ'
llm_load_print_meta: FIM PRE token = 151659 '<|fim_prefix|>'
llm_load_print_meta: FIM SUF token = 151661 '<|fim_suffix|>'
llm_load_print_meta: FIM MID token = 151660 '<|fim_middle|>'
llm_load_print_meta: FIM PAD token = 151662 '<|fim_pad|>'
llm_load_print_meta: FIM REP token = 151663 '<|repo_name|>'
llm_load_print_meta: FIM SEP token = 151664 '<|file_sep|>'
llm_load_print_meta: EOG token = 151643 '<|endoftext|>'
llm_load_print_meta: EOG token = 151645 '<|im_end|>'
llm_load_print_meta: EOG token = 151662 '<|fim_pad|>'
llm_load_print_meta: EOG token = 151663 '<|repo_name|>'
llm_load_print_meta: EOG token = 151664 '<|file_sep|>'
llm_load_print_meta: max token length = 256
llama_model_load: vocab only - skipping tensors
[GIN] 2025/02/25 - 11:14:51 200 7.3287789s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:51 200 7.7660488s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:52 200 8.1537322s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:52 200 8.5751568s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:53 200 9.3208469s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:54 200 8.5818893s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:54 200 3.1685145s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:54 200 2.9136502s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:55 200 2.8777743s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:55 200 2.7798309s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:55 200 2.3701159s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:56 200 2.2785749s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:58 200 720.5379ms ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:14:59 200 401.2956ms ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:00 200 378.1226ms ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:00 200 471.0166ms ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:01 200 832.9038ms ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:01 200 1.1757508s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:02 200 1.5900977s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:02 200 991.134ms ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:02 200 1.0005446s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:03 200 1.3541482s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:03 200 1.7760152s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:04 200 2.120463s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:04 200 1.631s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:05 200 1.9400383s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:05 200 1.9215594s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:05 200 2.5425675s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:06 200 2.8455751s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:07 200 2.8244837s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:07 200 2.9364401s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:07 200 2.7880363s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:08 200 3.1578944s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:20 200 348.1918ms ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:20 200 409.8554ms ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:21 200 1.1202959s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:21 200 1.545127s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:22 200 2.1139904s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:22 200 1.358748s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:23 200 1.7437894s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:24 200 2.6147921s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:24 200 2.885427s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:25 200 3.880471s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:26 200 4.3742678s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:27 200 4.3372885s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:27 200 4.1536467s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:27 200 3.3630472s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:27 200 3.2511559s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:28 200 3.1283842s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:29 200 2.7487578s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:30 200 3.2603559s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:31 200 3.9963366s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:31 200 4.3508972s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:33 200 5.3521894s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:33 200 4.6856626s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:33 200 4.1007002s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:33 200 3.4244904s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:35 200 3.6901891s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:35 200 3.4055498s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:36 200 3.4035947s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:37 200 3.684267s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:37 200 3.9212442s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:37 200 4.0899752s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:38 200 3.1187585s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:39 200 4.1808474s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:40 200 3.6491381s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:40 200 3.5879736s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:41 200 4.0180902s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:41 200 3.9312461s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:42 200 3.9165153s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:42 200 3.4689561s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:43 200 3.2319786s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:43 200 2.9300819s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:44 200 2.686313s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:45 200 4.0103876s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:46 200 4.0846349s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:46 200 3.7475056s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:47 200 4.3349991s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:48 200 4.5125138s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:48 200 4.2197719s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:49 200 3.3614304s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:49 200 3.3545672s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:49 200 3.1069226s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:50 200 2.5748207s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:50 200 2.4657722s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:51 200 2.5136658s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:51 200 2.1071808s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:51 200 2.3943719s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:52 200 2.4245178s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:52 200 2.5446927s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:53 200 2.6983817s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:53 200 2.8222555s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:54 200 2.6905089s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:54 200 2.3492962s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:54 200 2.3145159s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:54 200 2.1366501s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:55 200 1.8342911s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:56 200 2.5966222s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:56 200 2.6912717s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:57 200 2.762379s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:57 200 2.8266039s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:57 200 2.7426701s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:58 200 3.2072983s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:58 200 2.2049323s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:58 200 2.1285441s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:59 200 2.1587104s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:15:59 200 2.131814s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:00 200 2.3926385s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:00 200 2.0529584s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:01 200 2.8087544s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:02 200 3.332472s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:03 200 4.0688238s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:03 200 4.239045s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:04 200 4.0156073s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:04 200 3.9169984s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:04 200 3.2201877s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:04 200 2.6854144s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:05 200 2.2159459s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:05 200 2.1780543s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:06 200 2.9269918s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:07 200 3.1783455s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:08 200 3.8954602s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:09 200 4.620672s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:09 200 4.3838891s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:10 200 4.9425375s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:11 200 4.2422786s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:12 200 4.696477s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:12 200 3.842262s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:12 200 3.163913s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:13 200 3.4431302s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:13 200 3.0791831s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:14 200 3.3543222s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:14 200 2.7808742s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:15 200 2.9432083s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:15 200 2.9252687s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:16 200 3.1867987s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:16 200 2.8839845s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:17 200 2.5568107s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:18 200 3.1153409s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:18 200 3.5147467s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:19 200 3.5777409s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:19 200 2.9754494s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:20 200 3.5322765s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:21 200 4.3537799s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:21 200 3.7589427s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:22 200 3.4179737s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:23 200 4.2725789s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:24 200 4.5443705s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:24 200 4.0515906s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:24 200 3.2055273s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:25 200 3.4763393s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:25 200 3.4742491s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:26 200 2.6709127s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:26 200 2.4379356s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:26 200 2.5748336s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:27 200 3.0448934s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:28 200 2.8037037s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:28 200 2.8234802s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:29 200 3.2476636s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:30 200 3.9308945s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:30 200 3.8345538s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:31 200 3.5478196s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:32 200 4.2336541s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:33 200 4.9042264s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:33 200 4.499395s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:35 200 4.6294695s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:35 200 5.1076614s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:36 200 4.9973103s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:37 200 4.772967s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:37 200 4.1922865s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:38 200 4.1901572s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:38 200 3.6158628s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:39 200 3.1744685s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:39 200 3.3317547s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:40 200 2.9956777s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:40 200 3.152895s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:41 200 3.2434441s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:43 200 4.8130413s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:43 200 4.7438913s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:44 200 5.0739855s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:45 200 5.0808494s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:46 200 5.4784387s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:46 200 5.4944096s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:47 200 4.4545001s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:48 200 4.9436465s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:49 200 5.1605982s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:51 200 6.5445117s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:52 200 5.9307908s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:52 200 5.9519718s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:53 200 5.6882864s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:54 200 5.3100138s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:54 200 4.8571701s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:55 200 3.4415903s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:56 200 3.7622437s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:56 200 3.7019384s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:56 200 3.3260867s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:57 200 3.3680107s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:57 200 3.214365s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:58 200 3.2543558s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:58 200 2.8938954s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:59 200 2.9510391s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:59 200 2.8623678s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:16:59 200 2.4400354s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:00 200 2.3560764s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:00 200 2.3737694s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:01 200 2.4067866s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:01 200 2.4311728s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:02 200 2.5459707s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:02 200 2.9080402s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:03 200 2.9182617s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:03 200 2.8075584s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:03 200 2.3575367s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:04 200 2.2325207s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:04 200 2.1253963s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:05 200 2.2333797s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:05 200 2.2280289s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:05 200 2.2485598s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:05 200 2.2489291s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:06 200 2.2664296s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:06 200 1.9344187s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:06 200 1.4634154s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:07 200 1.5858191s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:07 200 1.6091603s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:07 200 1.9835791s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:08 200 1.9948573s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:08 200 2.3695913s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:09 200 2.75802s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:09 200 2.67091s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:10 200 2.7548245s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:10 200 2.7602995s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:11 200 2.7612592s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:11 200 2.7775988s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:12 200 2.781349s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:12 200 2.7953287s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:12 200 2.7189966s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:13 200 2.8189255s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:13 200 2.8356893s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:14 200 2.8484626s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:14 200 2.8607541s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:15 200 2.8710874s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:15 200 2.8872445s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:16 200 2.8186049s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:17 200 3.4323406s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:17 200 3.4555136s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:18 200 3.5145741s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:18 200 3.5430541s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:19 200 3.5416953s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:19 200 3.604365s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:20 200 3.0186653s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:20 200 2.9490801s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:21 200 2.9115433s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:21 200 2.5159598s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:21 200 2.1473705s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:21 200 1.7108981s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:21 200 1.3264835s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:21 200 1.006696s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:22 200 944.4729ms ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:22 200 1.4217277s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:23 200 1.9326533s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:24 200 2.4219449s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:24 200 2.9034424s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:25 200 3.3947065s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:25 200 3.5595377s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:26 200 3.4239676s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:26 200 3.3286596s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:27 200 3.2544269s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:27 200 3.2104753s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:28 200 3.1811755s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:28 200 3.0925191s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:29 200 3.1656976s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:29 200 3.1480499s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:30 200 3.2688738s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:31 200 3.353886s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:31 200 3.4083021s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:32 200 3.5149174s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:33 200 3.6046139s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:33 200 3.7172501s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:34 200 3.7089865s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:34 200 3.7385975s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:35 200 3.7895172s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:36 200 3.837671s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:36 200 3.8933518s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:37 200 3.9407762s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:38 200 4.0082156s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:38 200 4.0528437s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:39 200 4.0691012s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:40 200 4.1719543s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:41 200 4.202792s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:41 200 4.2494851s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:42 200 4.2786256s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:43 200 4.398167s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:44 200 4.3967059s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:44 200 4.30934s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:45 200 4.3009498s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:46 200 4.2705515s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:46 200 4.2360316s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:47 200 4.2825981s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:48 200 4.2998308s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:49 200 4.3241622s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:49 200 4.3424416s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:50 200 4.3778738s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:51 200 4.4000197s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:51 200 4.1452345s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:52 200 4.105176s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:53 200 4.063726s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:53 200 3.513973s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:53 200 3.4644139s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:54 200 3.4153562s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:55 200 3.4757236s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:55 200 3.378654s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:56 200 3.3708334s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:57 200 3.8638079s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:57 200 3.8378307s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:58 200 3.8265539s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:59 200 3.811862s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:17:59 200 3.899666s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:00 200 3.9563941s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:01 200 4.0055831s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:01 200 4.0838198s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:02 200 4.1678717s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:03 200 4.2317442s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:04 200 4.3014197s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:04 200 4.315982s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:05 200 4.2060935s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:05 200 4.0935418s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:07 200 4.4291895s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:07 200 4.6056324s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:08 200 4.5530671s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:09 200 4.4877802s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:09 200 4.5369549s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:10 200 4.608181s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:11 200 4.1771962s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:11 200 3.9816587s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:12 200 4.2060804s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:13 200 4.1988493s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:14 200 4.3410146s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:16 200 5.8054451s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:17 200 6.279348s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:18 200 6.6583982s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:19 200 6.6708593s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:20 200 6.9708172s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:21 200 7.685376s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:22 200 6.1803733s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:23 200 6.2296895s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:24 200 6.0350571s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:25 200 6.1692328s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:26 200 6.2989719s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:27 200 5.7703929s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:28 200 6.1839021s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:30 200 6.4129527s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:31 200 6.5224096s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:32 200 6.6077015s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:33 200 6.5540363s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:34 200 6.9826138s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:35 200 6.9845954s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:37 200 7.2698806s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:39 200 8.1010262s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:40 200 8.088533s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:41 200 8.2017785s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:42 200 7.7761493s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:43 200 7.9077091s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:45 200 7.7802369s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:46 200 7.3721747s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:48 200 8.0672822s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:49 200 8.3351721s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:52 200 9.7971273s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:53 200 10.3232958s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:55 200 10.3166223s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:57 200 10.5012662s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:18:59 200 10.856734s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:00 200 11.1219878s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:02 200 10.2785776s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:04 200 10.1486509s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:05 200 9.5956031s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:06 200 9.2515675s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:07 200 8.3275915s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:08 200 7.976327s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:09 200 7.4438727s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:11 200 7.1381823s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:12 200 7.0600384s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:13 200 6.9949436s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:14 200 6.762957s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:15 200 6.6188368s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:16 200 6.4740212s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:17 200 6.195286s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:18 200 6.1771036s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:19 200 5.9856827s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:20 200 6.0185215s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:21 200 5.9181651s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:22 200 6.3720262s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:23 200 6.3308751s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:25 200 6.8858684s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:26 200 6.9752413s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:27 200 7.3388968s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:28 200 7.5232141s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:30 200 7.31199s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:31 200 7.6048389s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:32 200 7.3629131s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:33 200 7.3990033s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:34 200 7.210995s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:36 200 7.6824146s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:38 200 8.2571119s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:40 200 9.2572933s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:42 200 9.8789271s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:44 200 10.375689s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:45 200 10.8875558s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:46 200 10.270388s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:48 200 9.9592638s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:49 200 9.0773386s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:51 200 8.7055319s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:52 200 8.2702076s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:53 200 7.5729317s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:54 200 7.8001035s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:55 200 7.5164806s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:57 200 7.4826202s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:58 200 7.4239733s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:19:59 200 7.3506229s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:00 200 7.572677s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:02 200 7.3857033s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:03 200 7.2676374s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:04 200 7.0109754s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:05 200 6.5777878s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:06 200 6.4732552s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:07 200 6.2957537s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:08 200 6.7931261s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:09 200 6.699347s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:10 200 6.7477384s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:12 200 7.1138419s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:13 200 7.5433473s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:15 200 7.8136994s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:16 200 7.3908512s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:17 200 7.6768861s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:18 200 8.0026997s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:20 200 8.0904756s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:21 200 8.1648325s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:23 200 8.4562048s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:25 200 8.849332s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:26 200 8.767583s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:27 200 8.6074114s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:29 200 8.6082002s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:30 200 8.365024s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:31 200 8.4142039s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:33 200 7.8851951s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:34 200 8.2341428s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:36 200 8.4983469s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:37 200 8.2149459s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:38 200 8.2325566s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:41 200 9.4994457s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:43 200 10.8187632s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:45 200 10.7060157s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:46 200 10.7420418s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:48 200 11.0782877s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:49 200 11.2366024s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:51 200 9.6560128s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:52 500 8.4159006s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:53 200 8.5879625s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:55 200 8.4720829s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:56 200 8.6711762s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:20:58 200 8.9654633s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:00 200 9.040996s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:01 200 9.4159878s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:02 200 9.1164209s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:04 200 8.7750599s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:05 200 8.5899912s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:07 200 8.4500222s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:08 200 8.2032486s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:11 200 9.6971972s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:13 200 10.9362766s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:15 200 11.3392738s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:16 200 11.1955113s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:17 200 10.8229504s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:19 200 10.8013848s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:21 200 9.9286228s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:22 200 8.8656202s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:24 200 9.1166041s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:26 200 9.4927914s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:27 200 9.9996948s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:29 200 10.2756135s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:31 200 10.1058197s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:32 200 10.0431581s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:34 200 9.5779369s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:35 200 9.7233035s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:37 200 9.5846037s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:39 200 10.0893535s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:40 200 9.4214418s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:42 200 9.5153092s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:43 200 9.4605953s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:44 200 8.9338784s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:46 200 8.7963297s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:47 200 8.4213128s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:49 200 8.6479604s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:50 200 8.3350493s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:52 200 8.9921869s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:53 200 8.8427537s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:55 200 8.7516257s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:56 200 8.7878837s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:21:58 200 8.7751655s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:01 200 10.6005196s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:02 200 9.9476617s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:04 200 10.300865s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:05 200 10.3288302s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:08 200 11.2658288s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:09 200 11.2639s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:25 200 1.4430755s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:26 200 3.1363005s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:28 200 4.4721679s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:29 200 5.7690737s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:30 200 7.1195351s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:32 200 7.8823845s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:34 200 9.347041s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:35 200 8.9176656s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:39 200 11.6469465s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:41 200 12.1856913s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:42 200 12.197278s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:44 200 11.9421335s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:47 200 12.6207617s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:48 200 13.1220243s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:50 200 10.821402s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:52 200 10.5835662s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:53 200 10.7513407s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:55 200 10.6812072s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:56 200 9.923196s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:22:59 200 10.3625362s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:01 200 10.7308334s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:03 200 10.940601s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:04 200 10.8377686s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:05 200 10.3577773s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:07 200 10.3621028s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:08 200 9.6520532s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:10 200 9.0530387s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:12 200 9.015832s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:14 200 9.7584962s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:15 200 10.0290267s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:17 200 9.9692996s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:18 200 10.0892725s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:20 200 10.1528563s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:23 200 11.0526741s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:24 200 10.5524349s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:26 200 10.5774678s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:29 200 11.8185417s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:30 200 11.5504393s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:31 200 11.3642593s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:33 200 10.5844328s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:35 200 10.8103256s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:38 200 12.3933282s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:40 200 11.7631909s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:42 200 11.8753269s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:43 200 12.0261851s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:45 200 12.1683271s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:23:47 200 12.2522398s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:10 200 2.3257241s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:12 200 4.4644046s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:13 200 5.6849626s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:15 200 7.653123s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:17 200 9.2876408s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:19 200 10.3618476s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:21 200 10.9364357s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:23 200 10.7263405s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:24 200 10.9807019s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:27 200 11.2414062s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:28 200 11.4612383s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:30 200 10.8477296s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:31 200 10.8350865s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:33 200 10.4596222s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:35 200 11.0601181s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:37 200 10.8048929s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:39 200 10.6734775s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:41 200 10.9457273s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:42 200 11.0066175s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:44 200 11.0217516s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:46 200 10.2114201s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:48 200 10.5387823s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:50 200 10.5060607s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:51 200 10.4705529s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:53 200 10.9893205s ::1 POST "/v1/chat/completions"
[GIN] 2025/02/25 - 11:24:55 200 11.1050161s ::1 POST "/v1/chat/completions"
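
The slowdown in the log above is easier to see if the request durations are averaged per minute. The following is a minimal sketch and not part of the original report: the function name `latency_by_minute` is hypothetical, and the regex assumes the access-log layout exactly as pasted here (it also tolerates the `|` column separators that an unmodified GIN log would have). Only durations printed in `s` or `ms` are handled; other units are skipped.

```python
import re
from collections import defaultdict

# Matches GIN access-log lines as pasted in this thread, e.g.
# [GIN] 2025/02/25 - 11:16:13 200 3.4431302s ::1 POST "/v1/chat/completions"
# The [\s|]+ separators also accept the "|" columns of a raw GIN log.
GIN_LINE = re.compile(
    r"\[GIN\] (\d{4}/\d{2}/\d{2}) - (\d{2}:\d{2}):\d{2}"
    r"[\s|]+(\d{3})[\s|]+([\d.]+)(ms|s)\b"
)


def latency_by_minute(lines):
    """Return {"YYYY/MM/DD HH:MM": (requests, mean_seconds, non_200_count)}."""
    durations = defaultdict(list)
    errors = defaultdict(int)
    for line in lines:
        match = GIN_LINE.search(line)
        if match is None:
            continue  # not an access-log line, or an unsupported duration unit
        date, minute, status, value, unit = match.groups()
        seconds = float(value) / 1000.0 if unit == "ms" else float(value)
        key = f"{date} {minute}"
        durations[key].append(seconds)
        if status != "200":
            errors[key] += 1
    return {
        key: (len(vals), sum(vals) / len(vals), errors[key])
        for key, vals in sorted(durations.items())
    }


if __name__ == "__main__":
    # Three lines copied from the log above; replace with the full log file.
    sample = [
        '[GIN] 2025/02/25 - 11:17:21 200 1.006696s ::1 POST "/v1/chat/completions"',
        '[GIN] 2025/02/25 - 11:17:22 200 944.4729ms ::1 POST "/v1/chat/completions"',
        '[GIN] 2025/02/25 - 11:20:52 500 8.4159006s ::1 POST "/v1/chat/completions"',
    ]
    for minute, (count, mean, failed) in latency_by_minute(sample).items():
        print(f"{minute}  n={count}  mean={mean:.2f}s  non-200={failed}")
```

Run over the whole log, a summary like this would show how the mean duration changes from minute to minute, and it would flag the single `500` response logged at 11:20:52.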

I'm seeing a similar, possibly the same, issue. Token generation sometimes slows down by more than half.