transformerlab / transformerlab-app

Experiment with Large Language Models

Inference on Trained Mistral-7b fails often

aliasaria opened this issue

Log attached:

message.txt

INFO:     127.0.0.1:59469 - "GET /server/worker_start?model_name=TransformerLab-mlx/MLX-Mistral-7B-Instruct-v0.2-1709360046_rivet&model_filename=/Users/timk/.transformerlab/workspace/models/MLX-Mistral-7B-Instruct-v0.2-1709360046_rivet&adaptor=&engine=mlx_server&experiment_id=1&parameters={%22inferenceEngine%22:%22mlx_server%22} HTTP/1.1" 200 OK
INFO:     127.0.0.1:59401 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59401 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59401 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59401 - "GET /model/get_conversation_template?model=TransformerLab-mlx/MLX-Mistral-7B-Instruct-v0.2-1709360046_rivet HTTP/1.1" 200 OK
INFO:     127.0.0.1:59401 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59401 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59401 - "GET /model/get_conversation_template?model=MLX-Mistral-7B-Instruct-v0.2-1709360046_rivet HTTP/1.1" 200 OK
INFO:     127.0.0.1:59401 - "GET /experiment/1/get_conversations HTTP/1.1" 200 OK
INFO:     127.0.0.1:59469 - "POST /v1/chat/count_tokens HTTP/1.1" 200 OK
INFO:     127.0.0.1:59469 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59469 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59469 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59469 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59469 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59469 - "POST /v1/chat/completions HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "POST /v1/chat/count_tokens HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/info HTTP/1.1" 200 OK
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/responses.py", line 277, in __call__
    await wrap(partial(self.listen_for_disconnect, receive))
  File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/responses.py", line 273, in wrap
    await func()
  File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/responses.py", line 250, in listen_for_disconnect
    message = await receive()
              ^^^^^^^^^^^^^^^
  File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 538, in receive
    await self.message_event.wait()
  File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/asyncio/locks.py", line 213, in wait
    await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 2a5487510

During handling of the above exception, another exception occurred:

  + Exception Group Traceback (most recent call last):
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 408, in run_asgi
  |     result = await app(  # type: ignore[func-returns-value]
  |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
  |     return await self.app(scope, receive, send)
  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/fastapi/applications.py", line 289, in __call__
  |     await super().__call__(scope, receive, send)
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/applications.py", line 122, in __call__
  |     await self.middleware_stack(scope, receive, send)
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/middleware/errors.py", line 184, in __call__
  |     raise exc
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/middleware/errors.py", line 162, in __call__
  |     await self.app(scope, receive, _send)
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
  |     raise exc
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
  |     await self.app(scope, receive, sender)
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 20, in __call__
  |     raise e
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/fastapi/middleware/asyncexitstack.py", line 17, in __call__
  |     await self.app(scope, receive, send)
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/routing.py", line 718, in __call__
  |     await route.handle(scope, receive, send)
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/routing.py", line 276, in handle
  |     await self.app(scope, receive, send)
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/routing.py", line 69, in app
  |     await response(scope, receive, send)
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/responses.py", line 270, in __call__
  |     async with anyio.create_task_group() as task_group:
  |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 664, in __aexit__
  |     raise BaseExceptionGroup(
  | ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    |     yield
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpcore/_async/http11.py", line 209, in _receive_event
    |     event = self._h11_state.next_event()
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/h11/_connection.py", line 469, in next_event
    |     event = self._extract_next_receive_event()
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/h11/_connection.py", line 419, in _extract_next_receive_event
    |     event = self._reader.read_eof()  # type: ignore[attr-defined]
    |             ^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/h11/_readers.py", line 204, in read_eof
    |     raise RemoteProtocolError(
    | h11._util.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)
    | 
    | The above exception was the direct cause of the following exception:
    | 
    | Traceback (most recent call last):
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpx/_transports/default.py", line 66, in map_httpcore_exceptions
    |     yield
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpx/_transports/default.py", line 249, in __aiter__
    |     async for part in self._httpcore_stream:
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpcore/_async/connection_pool.py", line 361, in __aiter__
    |     async for part in self._stream:
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpcore/_async/http11.py", line 337, in __aiter__
    |     raise exc
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpcore/_async/http11.py", line 329, in __aiter__
    |     async for chunk in self._connection._receive_response_body(**kwargs):
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpcore/_async/http11.py", line 198, in _receive_response_body
    |     event = await self._receive_event(timeout=timeout)
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpcore/_async/http11.py", line 208, in _receive_event
    |     with map_exceptions({h11.RemoteProtocolError: RemoteProtocolError}):
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/contextlib.py", line 158, in __exit__
    |     self.gen.throw(typ, value, traceback)
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    |     raise to_exc(exc) from exc
    | httpcore.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)
    | 
    | The above exception was the direct cause of the following exception:
    | 
    | Traceback (most recent call last):
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/responses.py", line 273, in wrap
    |     await func()
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/starlette/responses.py", line 262, in stream_response
    |     async for chunk in self.body_iterator:
    |   File "/Users/timk/.transformerlab/src/transformerlab/fastchat_openai_api.py", line 435, in chat_completion_stream_generator
    |     async for content in generate_completion_stream(gen_params):
    |   File "/Users/timk/.transformerlab/src/transformerlab/fastchat_openai_api.py", line 591, in generate_completion_stream
    |     async for raw_chunk in response.aiter_raw():
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpx/_models.py", line 990, in aiter_raw
    |     async for raw_stream_bytes in self.stream:
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpx/_client.py", line 146, in __aiter__
    |     async for chunk in self._stream:
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpx/_transports/default.py", line 248, in __aiter__
    |     with map_httpcore_exceptions():
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/contextlib.py", line 158, in __exit__
    |     self.gen.throw(typ, value, traceback)
    |   File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/site-packages/httpx/_transports/default.py", line 83, in map_httpcore_exceptions
    |     raise mapped_exc(message) from exc
    | httpx.RemoteProtocolError: peer closed connection without sending complete message body (incomplete chunked read)
    +------------------------------------
INFO:     127.0.0.1:59514 - "POST /api/v1/token_check HTTP/1.1" 400 Bad Request
INFO:     127.0.0.1:59514 - "POST /experiment/1/save_conversation HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /experiment/1/get_conversations HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "POST /v1/chat/count_tokens HTTP/1.1" 400 Bad Request
INFO:     127.0.0.1:59514 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59555 - "GET /server/info HTTP/1.1" 200 OK
INFO:     127.0.0.1:59514 - "GET /server/worker_healthz HTTP/1.1" 200 OK
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [28131]
FASTAPI LIFESPAN: Complete
Exception ignored in atexit callback: <function cleanup_at_exit at 0x29ca38040>
Traceback (most recent call last):
  File "/Users/timk/.transformerlab/src/api.py", line 321, in cleanup_at_exit
    worker_process.kill()
  File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/asyncio/subprocess.py", line 146, in kill
    self._transport.kill()
  File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/asyncio/base_subprocess.py", line 153, in kill
    self._check_proc()
  File "/Users/timk/miniconda3/envs/transformerlab/lib/python3.11/asyncio/base_subprocess.py", line 142, in _check_proc
    raise ProcessLookupError()
ProcessLookupError: 
🔴 Quitting spawned controller.
🔴 Quitting spawned workers.
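
In the traceback above, the worker closes its connection mid-stream, and the resulting `httpx.RemoteProtocolError` propagates out of `generate_completion_stream` into Starlette's streaming response as an unhandled error. One possible mitigation, sketched below, is to wrap the stream so the client receives a final error event instead. This is only a sketch and not the project's actual code: `guard_stream` and `flaky_worker` are hypothetical names, the error payload shape is an assumption, and the example uses the built-in `ConnectionError` so it runs without third-party packages. In the real handler the `errors` tuple would presumably be `(httpx.RemoteProtocolError,)`. It does not address why the worker dropped the connection in the first place.

```python
import asyncio
import json
from typing import AsyncIterator, Tuple, Type


async def guard_stream(
    upstream: AsyncIterator[bytes],
    errors: Tuple[Type[BaseException], ...] = (ConnectionError,),
) -> AsyncIterator[bytes]:
    """Relay chunks; if a listed error occurs, emit one error event and stop."""
    try:
        async for chunk in upstream:
            yield chunk
    except errors as exc:
        # Assumed payload shape: an SSE-style "data:" line carrying a JSON error.
        payload = {"error": {"type": type(exc).__name__, "message": str(exc)}}
        yield f"data: {json.dumps(payload)}\n\n".encode()


async def flaky_worker() -> AsyncIterator[bytes]:
    """Hypothetical stand-in for the worker: sends one chunk, then drops."""
    yield b"data: hello\n\n"
    raise ConnectionError(
        "peer closed connection without sending complete message body"
    )


async def collect() -> list:
    # Drain the guarded stream into a list so the result can be inspected.
    return [chunk async for chunk in guard_stream(flaky_worker())]


chunks = asyncio.run(collect())
print(chunks)
```

With a wrapper like this, the first chunk is still delivered and the dropped connection shows up as a final error event rather than as an `ExceptionGroup` in the ASGI server log.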