zty's starred repositories
GPT-SoVITS
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
insightface
State-of-the-art 2D and 3D Face Analysis Project
audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
PhotoMaker
PhotoMaker [CVPR 2024]
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
FlagEmbedding
Retrieval and Retrieval-augmented LLMs
awesome-pretrained-chinese-nlp-models
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models.
midjourney-proxy
Proxies MidJourney's Discord channel so that AI image generation can be called through an API.
YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
PySceneDetect
Python and OpenCV-based scene cut/transition detection program and library.
sd-webui-deforum
Deforum extension for AUTOMATIC1111's Stable Diffusion webui
BCEmbedding
Netease Youdao's open-source embedding and reranker models for RAG products.
TransNetV2
TransNet V2: Shot Boundary Detection Neural Network
SunoSongsCreator
High-quality song generation via https://www.suno.ai/. Reverse-engineered API.
whisper-onnx-tensorrt
ONNX and TensorRT implementation of Whisper