setting to configure completions API endpoint
khimaros opened this issue
i'm using llama-cpp-python[server] to run an OpenAI-compatible chat/completions endpoint, which generally runs at http://127.0.0.1:8000/v1. this is working well with other (desktop) clients such as BetterChatGPT.
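for context, any OpenAI-compatible client only needs its base URL pointed at the local server; the path layout follows the OpenAI REST shape. a minimal stdlib-only sketch of what such a request looks like (the endpoint is the one from this issue; the model name is a placeholder, since the local server typically ignores or maps it):

```python
import json
import urllib.request

# base URL of the local llama-cpp-python[server] instance (from this issue)
BASE_URL = "http://127.0.0.1:8000/v1"

def build_chat_request(messages, model="local-model"):
    """Build an OpenAI-style chat/completions POST request.

    `model` is a placeholder; llama.cpp servers usually serve one loaded model.
    """
    payload = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request([{"role": "user", "content": "hello"}])
print(req.full_url)  # → http://127.0.0.1:8000/v1/chat/completions
# urllib.request.urlopen(req) would send it once the server is running
```

a client that lets the user configure this base URL (rather than hardcoding api.openai.com) is all that's needed to support this setup.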