Sani's starred repositories
privateGPT
Interact with your documents using the power of GPT, 100% privately, no data leaks
supervision
We write your reusable computer vision tools. 💜
InternGPT
InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
sherpa-onnx
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter
Personalize-SAM
Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds
ChatWaifu_Mobile
Mobile version of an anime-style AI waifu chat app
whisper-ctranslate2
Whisper command-line client based on CTranslate2, compatible with the original OpenAI client.
SD-CN-Animation
This script automates video stylization tasks using Stable Diffusion and ControlNet.
MEGABYTE-pytorch
Implementation of MEGABYTE, Predicting Million-byte Sequences with Multiscale Transformers, in Pytorch
PyTorch_Speaker_Verification
PyTorch implementation of "Generalized End-to-End Loss for Speaker Verification" by Wan, Li et al.
book-text-to-speech
A book about Text-to-Speech (TTS) in Chinese.
prompt-optimizer
Minimize LLM token complexity to save API costs and model computations.
efficientspeech
PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.
stable-diffusion-webui-daam
DAAM for Stable Diffusion Web UI
stable-diffusion-webui-metadata-marker
Stable Diffusion WebUI extension. Renders generation information on the output image.