Naiyuan Liu's starred repositories
GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
generative-models
Generative Models by Stability AI
ultimatevocalremovergui
GUI for a Vocal Remover that uses Deep Neural Networks.
Open-Sora-Plan
This project aim to reproduce Sora (Open AI T2V model), we wish the open source community contribute to this project.
faster-whisper
Faster Whisper transcription with CTranslate2
AnimateDiff
Official implementation of AnimateDiff.
Douyin_TikTok_Download_API
🚀「Douyin_TikTok_Download_API」是一个开箱即用的高性能异步抖音、快手、TikTok、Bilibili数据爬取工具,支持API调用,在线批量解析及下载。
TikTokDownloader
完全免费开源,基于 AIOHTTP 模块实现:TikTok 主页/视频/图集/原声;抖音主页/视频/图集/收藏/直播/原声/合集/评论/账号/搜索/热榜数据采集工具
video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
OutfitAnyone
Outfit Anyone: Ultra-high quality virtual try-on for Any Clothing and Any Person
IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with image prompt.
Person_reID_baseline_pytorch
:bouncing_ball_person: Pytorch ReID: A tiny, friendly, strong pytorch implement of person re-id / vehicle re-id baseline. Tutorial 👉https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial
YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
PySceneDetect
:movie_camera: Python and OpenCV-based scene cut/transition detection program & library.
CTCDecoder
Connectionist Temporal Classification (CTC) decoding algorithms: best path, beam search, lexicon search, prefix search, and token passing. Implemented in Python.
PicImageSearch
整合图片识别 API,用于以图搜源 / Aggregator for Reverse Image Search API
llms_paper
该仓库主要记录 LLMs 算法工程师相关的顶会论文研读笔记(多模态、PEFT、小样本QA问答、RAG、LMMs可解释性、Agents、CoT)
clip_dinoiser
Official implementation of 'CLIP-DINOiser: Teaching CLIP a few DINO tricks' paper.