This page contains a collection of open-source AI models and tools available for various use cases.
- Summary:
- Resources
- Projects
- apple/ml-stable-diffusion - Port for Apple Silicon + CoreML
- fast-stable-diffusion - +25-50% speed increase, memory efficient, with DreamBooth
- Lsmith - StableDiffusionWebUI accelerated using TensorRT
- ControlNet - copy compositions or human poses from a reference image
- imaginAIry - Github - AI imagined images. Pythonic generation of stable diffusion images.
- Summary: Marrying Grounding DINO with Segment Anything & Stable Diffusion & BLIP - Automatically Detect, Segment and Generate Anything with Image and Text Inputs
- Resources:
- Projects
- Semantic-SAM - Segment and Recognize Anything at Any Granularity
- Summary: Combine static images with motion dynamics
- Resources:
- Summary: Create photos/paintings/avatars of anyone in any style within seconds
- Resources:
- Summary:
- Resources:
- Summary: Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.
- Resources
- Summary: YOLOv8 in PyTorch > ONNX > CoreML > TFLite. Can do detection, segmentation and much more.
- Resources
- Summary: 2D and 3D face alignment library built using PyTorch
- Resources
- Summary: high quality object masks from input prompts such as points or boxes
- Resources:
- Projects
- Summary: A detector with image classes that can use image-level labels to easily train detectors and can detect any given class names
- Resources:
- Summary: high-performance visual features that can be directly employed with classifiers as simple as linear layers on a variety of computer vision tasks
- Resources:
- Summary: Tracking Anything in High Quality
- Resources
- Summary: a lightweight feature matcher with high accuracy and blazing fast inference
- Resources:
- Summary: Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.
- Resources
- Projects:
- distil-whisper - 6x faster, 50% smaller, within 1% word error rate.
- Talk to your multi-lingual AI assistant - Uses Whisper, GPT-3 and Coqui-TTS
- Transcribe Youtube Video to text with OpenAI Whisper - YouTube - Using pytube and Whisper
- whisper.cpp - Port in C/C++, runs on CPU, including mobile and Raspberry Pi.
- Whisper - High-performance GPGPU inference for Windows
- whisperX - Timestamp-Accurate Automatic Speech Recognition using Forced Alignment
- faster-whisper - Faster Whisper transcription with CTranslate2
- whisper-jax - Optimised JAX code for Whisper
- Summary: BLOOM is an autoregressive Large Language Model (LLM), trained to continue text from a prompt on vast amounts of text data using industrial-scale computational resources. As such, it is able to output coherent text in 46 languages and 13 programming languages that is hardly distinguishable from text written by humans. BLOOM can also be instructed to perform text tasks it hasn't been explicitly trained for, by casting them as text generation tasks.
- Resources
- Introducing The World’s Largest Open Multilingual Language Model: BLOOM - Blog
- BLOOM Model Card - Huggingface (License: Responsible AI License)
- tr11-176B-ml - Github
- Projects
- bloomz.cpp - C++ implementation for BLOOM Inference
- Summary: A general-purpose scientific language model. It is trained on a large corpus of scientific text and data. It can perform scientific NLP tasks at a high level, as well as tasks such as citation prediction, mathematical reasoning, molecular property prediction and protein annotation.
- Resources
- Galactica online demo
- Galactica: A Large Language Model for Science - Paper
- galai - Github (License: Code - Apache 2.0, Model - CCA-NC4.0-PIL)
- Summary: a variant forked off GPT-J (6B), and performs exceptionally well on text classification and other tasks
- Resources
- Summary: A language model trained on biomedical literature which delivers an improved state of the art for medical question answering.
- Summary: The simplest, fastest repository for training/finetuning medium-sized GPTs
- Resources
- nanoGPT - Github (License: MIT)
- Summary: Run 100B+ language models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
- Resources:
- Summary: ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and open source.
- Resources:
- Summary: Large Language Model Meta AI
- Resources:
- Projects
- open_llama - a permissively licensed open source reproduction
- LLaMa - facebookresearch - Minimal project for inference
- llama.cpp - Inference with C/C++
- dalai - The simplest way to run LLaMA on your local machine
- llama-rs - Run LLaMA inference on CPU, with Rust
- alpaca-lora - Instruct-tune LLaMA on consumer hardware
- vicuna - an open-source chatbot trained by fine-tuning LLaMA
- FastChat - Github
- ChatDoctor
- lit-LLaMA - Implementation of the LLaMA language model based on nanoGPT (Commercial Use)
- Open-Llama - Train Llama model
- open_llama - OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
- llama2.c - Inference Llama 2 in one file of pure C
- llama-dfdx - LLaMA 7B with CUDA acceleration implemented in Rust. Minimal GPU memory needed!
- llama2.mojo - Inference Llama 2 in one file of pure Mojo
- Summary: LLM for research and commercial purposes. Allows commercial use up to $1M revenue.
- Resources:
- Summary: Data-Centric FinGPT. Open-source for open finance!
- Resources
- Summary: Open-source LLM free for research and commercial use
- Resources
- Projects
- Llama2-Onnx - an optimized version of the Llama 2 model
- llama-recipes - Examples and recipes for Llama 2 model
- Summary: 7B model with Apache license, commercial use
- Resources
- Announcing Mistral 7B - Blog post
- mistral-src - Inference code
- Huggingface - Model in hub
- Summary: Family of (4) SOTA LLMs (2B/7B x Base/Instruction) by Google
- Resources
- Projects
- gemma.cpp - lightweight, standalone C++ inference engine for Google's Gemma models
- Summary: Model that can support up to 8K context length
- Resources
- Summary: An open source implementation of CLIP.
- Resources:
- Summary: a novel state-of-the-art open-source text-to-image model with a high degree of photorealism and language understanding
- Resources:
- Summary: Efficient Multimodal Large Language Model via Small Backbones. Requires a 24G GPU for training and an 8G GPU or CPU for inference.
- Resources:
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones - Research paper
- TinyGPT-V - Github - Code
- Summary:
- Resources:
- LLaVA: Large Language and Vision Assistant - Project page
- Demo
- Research papers
- Visual Instruction Tuning (NeurIPS 2023 Oral)
- Improved Baselines with Visual Instruction Tuning (LLaVA 1.5)
- LLaVA - Github
- Summary: a tiny (1.6B) vision language model that kicks ass and runs anywhere
- Resources:
- Summary: 1M context length open model for long video and audio understanding
- Resources:
- Summary: A deep learning toolkit for Text-to-Speech, battle-tested in research and production.
- Resources
- Summary: A multi-voice TTS system trained with an emphasis on quality
- Resources
- Summary: Understanding and Generating Speech, Music, Sound, and Talking Head
- Resources:
- Summary: Text-Prompted Generative Audio Model
- Resources
- Summary: a powerful and modern open-source text-to-speech engine. EmotiVoice speaks both English and Chinese, and with over 2000 different voices. The most prominent feature is emotional synthesis, allowing you to create speech with a wide range of emotions, including happy, excited, sad, angry and others.
- Resources
- Summary: High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.
- Resources:
- Summary: An open-source deep-learning toolkit for training and deploying speech-to-text models.
- Resources:
- Summary: Attention network for tabular data
- Resources
- Summary: Instant neural graphics primitives: lightning fast NeRF and more
- Resources
- instant-ngp (License: NVIDIA Custom License)
- Getting started with NVIDIA Instant NeRFs
- Summary: Generate 3D objects conditioned on text or images
- Resources
- Summary:
- Resources:
- Summary: Building applications with LLMs through composability
- Resources:
- Projects
- langflow - LangFlow is a UI for LangChain
- flowise - Drag & drop UI to build your customized LLM flow using LangchainJS
- awesome-langchain - Awesome list of tools and projects with the awesome LangChain framework
- Summary: Build and control your own LLMs
- Resources:
- Summary: Self-hosted, community-driven, simple local OpenAI-compatible API written in Go
- Resources:
- Summary: The LLM engine for rapidly customizing models. Allows commercial use!
- Resources:
- Summary: One-stop Transformer Library for State-of-the-art Code LLM
- Resources
- Summary: Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
- Resources
- Summary: Open-source large language models that run locally on your CPU and nearly any GPU
- Resources:
- Summary: OpenChatKit provides a powerful, open-source base to create both specialized and general purpose models for various applications. The kit includes instruction-tuned language models, a moderation model, and an extensible retrieval system for including up-to-date responses from custom repositories.
- Resources:
- Summary: A React and Electron-based app that executes the FreedomGPT LLM locally (offline and private) on Mac and Windows using a chat-based interface (based on Alpaca Lora)
- Resources
- Summary: Open Assistant is a project meant to give everyone access to a great chat based large language model.
- Resources:
- Summary: A dev-first open source autonomous AI agent framework. Enabling developers to build, manage & run useful autonomous agents quickly and reliably.
- Resources
- Summary: A fast inference library for running LLMs locally on modern consumer-class GPUs
- Resources
- Summary: a local knowledge base question-answering system designed to support a wide range of file formats and databases, allowing for offline installation and use
- Resources:
- Summary: Providing enterprise-grade LLM-based development framework, tools, and fine-tuned models.
- Resources:
- Summary: a library for performing segmentation of objects in images and videos
- Resources
- Summary: A Pipeline-Level Solution for Real-Time Interactive Generation
- Resources
- Summary: We write your reusable computer vision tools.
- Resources:
- Summary: InvokeAI is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media
- Resources:
- Summary: one-click face swap
- Resources:
- Summary: ShortGPT is a powerful framework for automating content creation. It simplifies video creation, footage sourcing, voiceover synthesis, and editing tasks.
- Resources:
- Summary: a voice assistant made as an experiment using neural networks with Rust
- Resources:
- Summary: TaskMatrix connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting.
- Resources
- Summary: Multi modal AI agent
- Resources
- Summary: Enhanced ChatGPT Clone: Features OpenAI, GPT-4 Vision, Bing, Anthropic, OpenRouter, Google Gemini, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. More features in development
- Resources:
- Summary: A natural language interface for computers
- Resources:
- Summary: The open-source tool for building high-quality datasets and computer vision models
- Resources:
- ColossalAI - Making large AI models cheaper, faster and more accessible
- monai - medical imaging with deep learning
- supervision - We write your reusable computer vision tools
- SpeechBrain - An Open-Source Conversational AI Toolkit
- OpenNMT - An open source neural machine translation system
- outlines - Neuro Symbolic Text Generation
- llm-foundry - LLM training code for MosaicML foundation models
- chainlit - Build Python LLM apps in minutes!
- languagemodels - Explore large language models on any computer with 512MB of RAM
- lit-gpt - Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
- rasa - Open source machine learning framework to automate text- and voice-based conversations
- RasaGPT - headless LLM chatbot platform
- modelzoo.co - Discover open source deep learning code and pretrained models.
- OpenVINO Model Zoo - Model zoo from multiple sources
- replicate - easy to use setup for popular models
- modelscope - bring the notion of Model-as-a-Service to life
- https://civitai.com/
- open-llms - A list of open LLMs available for commercial use.
- AI Product Index - A curated index to track AI-powered products.
- awesome-generative-ai - A curated list of modern Generative Artificial Intelligence projects and services
- LinkedIn Post - Commercial use LLMs - List of commercially usable LLMs
- ai-collection - A Collection of Awesome Generative AI Applications
- tuning-playbook - A playbook for systematically maximizing the performance of deep learning models.
- ollama - Get up and running with large language models, locally.
- inference - Replace OpenAI GPT with another LLM in your app by changing a single line of code
- llama-embeddings-fastapi-service - designed to facilitate and optimize the process of obtaining text embeddings using different LLMs