A self-hosted LLM server that can download and load small models such as TinyLlama or Phi-2 (among others) and serve them via an HTTP endpoint.
Repository on GitHub: https://github.com/marlinspike/tiny_llm_server
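As a rough illustration of the serving pattern described above, here is a minimal sketch of an HTTP endpoint that maps short model names to checkpoints and answers generation requests. Everything here is an assumption for illustration only: the `/generate` route, the JSON request/response shape, and the checkpoint names are not confirmed details of the tiny_llm_server repo, and real inference (e.g. via a transformers pipeline) is stubbed out with an echo.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical model registry; a real server would download these
# checkpoints on first use. Names are assumptions, not repo details.
MODELS = {
    "tinyllama": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "phi-2": "microsoft/phi-2",
}

def generate(model_name: str, prompt: str) -> str:
    # Placeholder for actual inference; a real implementation would
    # load the checkpoint and run the model on the prompt.
    repo = MODELS.get(model_name, MODELS["tinyllama"])
    return f"[{repo}] echo: {prompt}"

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Assumed route: POST /generate with {"model": ..., "prompt": ...}
        if self.path != "/generate":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        text = generate(body.get("model", "tinyllama"), body.get("prompt", ""))
        payload = json.dumps({"completion": text}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    # Serve on localhost; query with e.g.
    #   curl -X POST localhost:8000/generate -d '{"prompt": "hi"}'
    HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()
```

The registry-plus-endpoint split keeps model selection separate from transport, so swapping the echo stub for real model loading would not change the HTTP interface.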