Naiyuan Liu's starred repositories
GPT-SoVITS
One minute of voice data is enough to train a good TTS model (few-shot voice cloning).
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
nlp_chinese_corpus
Large-scale Chinese corpus for natural language processing (NLP)
IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images from an image prompt.
YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Person_reID_baseline_pytorch
PyTorch ReID: a tiny, friendly, strong PyTorch implementation of a person re-ID / vehicle re-ID baseline. Tutorial 👉 https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial
T2I-Adapter
T2I-Adapter
speech-to-speech
Speech To Speech: an effort toward an open-source, modular GPT-4o
recognize-anything
Open-source and strong foundation image recognition models.
occupancy_networks
This repository contains the code for the paper "Occupancy Networks - Learning 3D Reconstruction in Function Space"
Grounded-SAM-2
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
BaiduImageSpider
A super-lightweight Baidu image crawler
PicImageSearch
Aggregator for reverse image search APIs, used to find the source of an image
clip_dinoiser
Official implementation of 'CLIP-DINOiser: Teaching CLIP a few DINO tricks' paper.
TextGenerator
Generator for OCR, text-detection, and font-classification datasets
ocr_synth_text_chinese
Generates training datasets for text detection