LlamaEdge / LlamaEdge

The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge

Home Page: https://llamaedge.com/

Feature Request: Simplify the run-llm.sh script interactions

alabulei1 opened this issue

Summary

The current version of run-llm.sh asks users to read a lot of text and make many decisions, which is difficult for users who are unfamiliar with basic Wasm and LLM terminology. We propose simplifying the script's interactions as follows.

Scenario 1: The user runs the script with no options. It will use the following defaults and NOT ask the user any questions (a sketch of this flow follows the list).

  • Install WasmEdge with GGML if it is not already installed
  • Download the latest llama-api-server.wasm app if it has not already been downloaded
  • Download the Gemma-2b model file if it has not already been downloaded
  • Start the API server
  • Launch the browser to http://localhost:8080
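
For illustration, here is a minimal sketch of what that zero-question default flow could look like in run-llm.sh. The download URLs, the Gemma-2b file name, and the prompt-template name are assumptions for illustration, not the script's actual values.

    # Sketch of the proposed zero-question default flow (assumed URLs and file names).
    set -e

    # Install WasmEdge with the GGML plugin only if it is not already installed
    if ! command -v wasmedge >/dev/null 2>&1; then
        curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh \
            | bash -s -- --plugins wasi_nn-ggml
        source "$HOME/.wasmedge/env"
    fi

    # Download the API server app only if it is missing (assumed release URL)
    if [ ! -f llama-api-server.wasm ]; then
        curl -LO https://github.com/LlamaEdge/LlamaEdge/releases/latest/download/llama-api-server.wasm
    fi

    # Download the default Gemma-2b model only if it is missing (assumed file name and URL)
    model_file="gemma-2b-it-q4_0.gguf"
    if [ ! -f "$model_file" ]; then
        curl -LO "https://huggingface.co/second-state/Gemma-2b-it-GGUF/resolve/main/$model_file"
    fi

    # Start the API server and point the browser at the chatbot UI
    wasmedge --dir .:. --nn-preload default:GGML:AUTO:"$model_file" \
        llama-api-server.wasm --prompt-template gemma-instruct &
    sleep 2
    open "http://localhost:8080" 2>/dev/null || xdg-open "http://localhost:8080"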

Scenario 2: The user runs the script with a specified model name (e.g., --model llama2-7b-chat). It will use the following defaults and NOT ask the user any questions (a sketch of the model lookup follows the list).

  • Install WasmEdge with GGML if it is not already installed
  • Download the latest llama-api-server.wasm app if it has not already been downloaded
  • Download the specified model file if it has not already been downloaded. If the model name is unknown, print the list of supported model names.
  • Start the API server
  • Launch the browser to http://localhost:8080
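
The unknown-model check could be a small lookup along these lines, assuming $model holds the value passed to --model; the model names and URLs shown are placeholders rather than the real catalog.

    # Sketch of resolving a --model name to a download URL (placeholder names and URLs).
    resolve_model_url() {
        case "$1" in
            gemma-2b)        echo "https://example.com/models/gemma-2b-it-q4_0.gguf" ;;
            llama2-7b-chat)  echo "https://example.com/models/llama-2-7b-chat.Q5_K_M.gguf" ;;
            *)
                echo "Unknown model: $1" >&2
                echo "Supported models: gemma-2b, llama2-7b-chat" >&2
                return 1 ;;
        esac
    }

    # Download the model file only if it is not already present
    url="$(resolve_model_url "$model")" || exit 1
    model_file="$(basename "$url")"
    [ -f "$model_file" ] || curl -LO "$url"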

Scenario 3: The user runs the script with the --interactive flag. It will ask all questions, as the script does now.
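
Tying the three scenarios together, the top-level argument handling could dispatch as in the sketch below; run_with_defaults and run_interactive_setup are hypothetical helper functions standing in for the default flow above and for the script's existing question-and-answer logic.

    # Sketch of top-level flag handling for the three scenarios.
    interactive=0
    model="gemma-2b"   # assumed default model name

    while [ $# -gt 0 ]; do
        case "$1" in
            --model)        model="$2"; shift 2 ;;
            --interactive)  interactive=1; shift ;;
            *)              echo "Unknown option: $1" >&2; exit 1 ;;
        esac
    done

    if [ "$interactive" -eq 1 ]; then
        run_interactive_setup        # Scenario 3: keep the current Q&A flow (hypothetical helper)
    else
        run_with_defaults "$model"   # Scenarios 1 and 2: no questions asked (hypothetical helper)
    fi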