A self-hosted LLM server that can download and load small models such as TinyLlama or Phi-2 (among others) and serve them via an HTTP endpoint.
Repository on GitHub: https://github.com/marlinspike/tiny_llm_server
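As a rough illustration of the serving pattern described above, here is a minimal sketch of an HTTP endpoint that maps short model names to checkpoints and answers generation requests. Everything here is an assumption for illustration only: the `/generate` route, the JSON request/response shape, and the checkpoint names are not confirmed details of the tiny_llm_server repo, and real inference (e.g. via a transformers pipeline) is stubbed out with an echo.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical model registry; a real server would download these
# checkpoints on first use. Names are assumptions, not repo details.
MODELS = {
    "tinyllama": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "phi-2": "microsoft/phi-2",
}

def generate(model_name: str, prompt: str) -> str:
    # Placeholder for actual inference; a real implementation would
    # load the checkpoint and run the model on the prompt.
    repo = MODELS.get(model_name, MODELS["tinyllama"])
    return f"[{repo}] echo: {prompt}"

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Assumed route: POST /generate with {"model": ..., "prompt": ...}
        if self.path != "/generate":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        text = generate(body.get("model", "tinyllama"), body.get("prompt", ""))
        payload = json.dumps({"completion": text}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    # Serve on localhost; query with e.g.
    #   curl -X POST localhost:8000/generate -d '{"prompt": "hi"}'
    HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()
```

The registry-plus-endpoint split keeps model selection separate from transport, so swapping the echo stub for real model loading would not change the HTTP interface.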