| Documentation | Blog | Discord | Roadmap |
EdgenChat, a local chat app powered by ⚡Edgen
- OpenAI Compliant API: ⚡Edgen implements an OpenAI compatible API, making it a drop-in replacement.
- Multi-Endpoint Support: ⚡Edgen exposes multiple AI endpoints such as chat completions (LLMs) and speech-to-text (Whisper) for audio transcriptions.
- Model Agnostic: LLMs (Llama2, Mistral, Mixtral...), Speech-to-text (whisper) and many others.
- Optimized Inference: You don't need to take a PhD in AI optimization. ⚡Edgen abstracts the complexity of optimizing inference for different hardware, platforms and models.
- Modular: ⚡Edgen is model and runtime agnostic. New models can be added easily and ⚡Edgen can select the best runtime for the user's hardware: you don't need to keep up about the latest models and ML runtimes - ⚡Edgen will do that for you.
- Model Caching: ⚡Edgen caches foundational models locally, so 1 model can power hundreds of different apps - users don't need to download the same model multiple times.
- Native: ⚡Edgen is built in 🦀Rust and is natively compiled to all popular platforms: Windows, MacOS and Linux. No docker required.
- Graphical Interface: A graphical user interface to help users efficiently manage their models, endpoints and permissions.
⚡Edgen lets you use GenAI in your app, completely locally on your user's devices, for free and with data-privacy. It's a drop-in replacement for OpenAI (it uses the a compatible API), supports various functions like text generation, speech-to-text and works on Windows, Linux, and MacOS.
- Session Caching: ⚡Edgen maintains top performance with big contexts (big chat histories), by caching sessions. Sessions are auto-detected in function of the chat history.
- GPU support: CUDA, Vulkan. Metal coming soon
- [Chat] Completions
- [Audio] Transcriptions
- [Embeddings] Embeddings #41
- [Image] Generation
- [Chat] Multimodal chat completions
- [Audio] Speech
Check in the documentation
- Windows
- Linux
- MacOS
-
Data Private: On-device inference means users' data never leave their devices.
-
Scalable: More and more users? No need to increment cloud computing infrastructure. Just let your users use their own hardware.
-
Reliable: No internet, no downtime, no rate limits, no API keys.
-
Free: It runs locally on hardware the user already owns.
Ready to start your own GenAI application? Checkout our guides!
⚡Edgen usage:
Usage: edgen [<command>] [<args>]
Toplevel CLI commands and options. Subcommands are optional. If no command is provided "serve" will be invoked with default options.
Options:
--help display usage information
Commands:
serve Starts the edgen server. This is the default command when no
command is provided.
config Configuration-related subcommands.
version Prints the edgen version to stdout.
oasgen Generates the Edgen OpenAPI specification.
edgen serve
usage:
Usage: edgen serve [-b <uri...>] [-g]
Starts the edgen server. This is the default command when no command is provided.
Options:
-b, --uri if present, one or more URIs/hosts to bind the server to.
`unix://` (on Linux), `http://`, and `ws://` are supported.
For use in scripts, it is recommended to explicitly add this
option to make your scripts future-proof.
-g, --nogui if present, edgen will not start the GUI; the default
behavior is to start the GUI.
--help display usage information
⚡Edgen also supports compilation and execution on a GPU, when building from source, through Vulkan and CUDA. The following cargo features enable the GPU:
llama_vulkan
- execute LLM models using Vulkan. Requires a Vulkan SDK to be installed.llama_cuda
- execute LLM models using CUDA. Requires a CUDA Toolkit to be installed.whisper_cuda
- execute Whisper models using CUDA. Requires a CUDA Toolkit to be installed.
Note that, at the moment, llama_vulkan
and llama_cuda
cannot be enabled at the same time.
Example usage (building from source, you need to first install the prerequisites):
cargo run --features llama_vulkan --release -- serve
If you don't know where to start, check Edgen's roadmap! Before you start working on something, see if there's an existing issue/pull-request. Pop into Discord to check with the team or see if someone's already tackling it.
- Edgen Discord server: Real time discussions with the ⚡Edgen team and other users.
- GitHub issues: Feature requests, bugs.
- GitHub discussions: Q&A.
- Blog: Big announcements.
llama.cpp
,whisper.cpp
, andggml
for being an excellent getting-on point for this space.