WasmEdge / WasmEdge

WasmEdge is a lightweight, high-performance, and extensible WebAssembly runtime for cloud native, edge, and decentralized applications. It powers serverless apps, embedded functions, microservices, smart contracts, and IoT devices.

Home Page: https://WasmEdge.org

feat: Support `top_p` and `presence_penalty` configuration in the ggml plugin

alabulei1 opened this issue

Summary

This feature was requested by a user.

Popular web chatbot UI frameworks such as ChatGPTNextWeb and Lobe Hub support `top_p` and `presence_penalty`. Because LlamaEdge doesn't support these two parameters, the API server created by LlamaEdge can't be used with these two frameworks.

I think it would be better if we could support these two parameters.

Details

Example:

-d '{
        "messages": [{"role": "user", "content": "李白 诗仙代表作"}],
        "model": "Mixtral-8x7B-Instruct-v0.1",
        "temperature": 0.7,
        "stream": true,
        "presence_penalty": 0,
        "frequency_penalty": 0,
        "top_p": 1
    }'
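For context, here is a minimal Python sketch of what these two parameters conventionally do during sampling. It is not the ggml plugin's or LlamaEdge's implementation. The function names `apply_presence_penalty` and `top_p_filter` are hypothetical, and the semantics assumed are the common ones: a flat logit penalty for tokens that have already appeared, and nucleus sampling that keeps the smallest set of tokens whose cumulative probability reaches `top_p`.

```python
import math


def apply_presence_penalty(logits, seen_token_ids, presence_penalty):
    """Subtract a flat penalty from the logit of every token that already appeared.

    Assumed semantics: the penalty is applied once per token, regardless of
    how many times the token occurred (unlike a frequency penalty).
    """
    return [
        logit - presence_penalty if token_id in seen_token_ids else logit
        for token_id, logit in enumerate(logits)
    ]


def top_p_filter(logits, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p.

    Returns a dict mapping token id to its renormalized probability.
    """
    # Numerically stable softmax over the logits.
    max_logit = max(logits)
    exps = [math.exp(logit - max_logit) for logit in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Walk tokens from most to least likely until the nucleus is full.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for token_id in ranked:
        kept.append(token_id)
        cumulative += probs[token_id]
        if cumulative >= top_p:
            break

    # Renormalize so the kept probabilities sum to 1.
    kept_total = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_total for i in kept}
```

Under these assumptions, the values in the request above (`"presence_penalty": 0` and `"top_p": 1`) leave the distribution unchanged: no logit is penalized and no token is filtered out. That suggests a server could accept the fields with those defaults even before the sampler honors other values.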

Appendix

No response

@dm4
Can you check whether we have to support these options on the plugin side?