bug:
zyxcambridge opened this issue
Summary
abstract_summary = abstract_summary_extraction(transcription)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/zhangyixin/Desktop/MeetingSummary/openai_meetging_summary.py", line 36, in abstract_summary_extraction
response = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/zhangyixin/anaconda3/lib/python3.11/site-packages/openai/_utils/_utils.py", line 299, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/zhangyixin/anaconda3/lib/python3.11/site-packages/openai/resources/chat/completions.py", line 594, in create
return self._post(
^^^^^^^^^^^
File "/Users/zhangyixin/anaconda3/lib/python3.11/site-packages/openai/_base_client.py", line 1055, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/zhangyixin/anaconda3/lib/python3.11/site-packages/openai/_base_client.py", line 834, in request
return self._request(
^^^^^^^^^^^^^^
File "/Users/zhangyixin/anaconda3/lib/python3.11/site-packages/openai/_base_client.py", line 899, in _request
return self._process_response(
^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/zhangyixin/anaconda3/lib/python3.11/site-packages/openai/_base_client.py", line 506, in _process_response
return api_response.parse()
^^^^^^^^^^^^^^^^^^^^
File "/Users/zhangyixin/anaconda3/lib/python3.11/site-packages/openai/_response.py", line 59, in parse
parsed = self._parse()
^^^^^^^^^^^^^
File "/Users/zhangyixin/anaconda3/lib/python3.11/site-packages/openai/_response.py", line 175, in _parse
content_type, *_ = response.headers.get("content-type").split(";")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'split'
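The AttributeError is a symptom rather than the root cause: the server answered the request without a Content-Type header (here, after rejecting the over-long prompt), so `response.headers.get("content-type")` returns None and the unpacking line fails. A minimal sketch of the failure mode and a defensive parse, using a plain dict as a stand-in for the httpx headers object:

```python
def parse_content_type(headers):
    """Extract the media type from a headers mapping, tolerating its absence."""
    raw = headers.get("content-type")
    if raw is None:  # server replied without a Content-Type header
        return None
    content_type, *_ = raw.split(";")  # drop parameters such as charset
    return content_type.strip()

# Header present: parameters after ';' are discarded.
assert parse_content_type({"content-type": "application/json; charset=utf-8"}) == "application/json"
# Header absent: returns None instead of raising AttributeError.
assert parse_content_type({}) is None
```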
Reproduction steps
curl -X POST http://0.0.0.0:8080/v1/chat/completions -H 'accept:application/json' -H 'Content-Type: application/json' -d '{"messages":[{"role":"system", "content":"You are a helpful AI assistant"}, {"role":"user", "content":"What is the capital of France?"}], "model":"Yi-34B-Chat"}'
Any logs you want to share for showing the specific issue
wasmedge --dir .:. --nn-preload default:GGML:AUTO:TheBloke/Yi-34B-Chat-GGUF/yi-34b-chat.Q8_0.gguf llama-api-server.wasm -p chatml
[INFO] Socket address: 0.0.0.0:8080
[INFO] Model name: default
[INFO] Model alias: default
[INFO] Prompt context size: 512
[INFO] Number of tokens to predict: 1024
[INFO] Number of layers to run on the GPU: 100
[INFO] Batch size for prompt processing: 512
[INFO] Temperature for sampling: 0.8
[INFO] Penalize repeat sequence of tokens: 1.1
[INFO] Prompt template: ChatML
[INFO] Log prompts: false
[INFO] Log statistics: false
[INFO] Log all information: false
[INFO] Starting server ...
[INFO] Plugin version: b1953 (commit 6f9939d1)
[INFO] Listening on http://0.0.0.0:8080
[WARNING] The prompt is too long. Please reduce the length of your input and try again.
[WARNING] The prompt is too long. Please reduce the length of your input and try again.
[WARNING] The prompt is too long. Please reduce the length of your input and try again.
[WARNING] The prompt is too long. Please reduce the length of your input and try again.
Model Information
yi34
Operating system information
mac m1
ARCH
arm
CPU Information
m1
Memory Size
64 GB
GPU Information
m1
VRAM Size
64 GB
@zyxcambridge The command you provided is not correct: it is missing a reverse prompt, and the 512-token context size is what triggers the "prompt is too long" warnings. Please refer to the following command:
wasmedge --dir .:. --nn-preload default:GGML:AUTO:yi-34b-chat.Q5_K_M.gguf llama-api-server.wasm -p chatml --reverse-prompt '<|im_end|>' --ctx-size 2048
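With the context size raised, the client side needs no changes. For reference, a stdlib-only sketch that builds the same chat-completion request as the curl reproduction (host, port, and model name copied from the report; the helper name is mine):

```python
import json
import urllib.request

def build_chat_request(prompt, base="http://0.0.0.0:8080/v1", model="Yi-34B-Chat"):
    """Assemble the chat-completion request the curl reproduction sends."""
    body = json.dumps({
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful AI assistant"},
            {"role": "user", "content": prompt},
        ],
    }).encode()
    return urllib.request.Request(
        f"{base}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json", "accept": "application/json"},
    )

req = build_chat_request("What is the capital of France?")
# urllib.request.urlopen(req)  # only works while llama-api-server is running
```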