lrain-CN's starred repositories
4DGaussians
[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
AnimateZero
Official PyTorch implementation for the paper "AnimateZero: Video Diffusion Models are Zero-Shot Image Animators"
gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
aimoneyhunter
The Ultimate Guide to Making Money with AI Side Hustles: Learn how to leverage AI for some cool side gigs and rake in some extra cash. Check out the English version for more insights.
PhotoMaker
PhotoMaker [CVPR 2024]
pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
EfficientSAM
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
faster-whisper
Faster Whisper transcription with CTranslate2
opencompass
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.
Video-LLaVA
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
awesome-video-text-datasets
A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or video retrieval.
so-vits-svc
SoftVC VITS Singing Voice Conversion
EmotiVoice
EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
awesome-video-text-retrieval
A curated list of deep learning resources for video-text retrieval.
pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Awesome-Cross-Modal-Video-Moment-Retrieval
Continuously updated list of cutting-edge papers on video moment localization, temporal language grounding, and video clip retrieval.
Book1_Python-For-Beginners
Book_1_"Programming Is Not Hard" | Iris Book series: From Basic Arithmetic to Machine Learning; feedback and corrections are welcome!
Test-Agent
Agent that empowers software testing with LLMs; an industry-first in China
Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Image-Text-Matching-Summary
Summary of Related Research on Image-Text Matching