lrain-CN's starred repositories

marqo

Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai

Language: Python | License: Apache-2.0 | Stargazers: 4405 | Issues: 0

4DGaussians

[CVPR 2024] 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 2014 | Issues: 0

AnimateZero

Official PyTorch implementation for the paper "AnimateZero: Video Diffusion Models are Zero-Shot Image Animators"

Stargazers: 345 | Issues: 0

MAG-Edit

MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance

Language: Python | Stargazers: 82 | Issues: 0

VGen

Official repo for VGen: a holistic video generation ecosystem built on diffusion models

Language: Python | Stargazers: 2838 | Issues: 0

AnyDoor

Official implementations for paper: Anydoor: zero-shot object-level image customization

Language: Python | License: MIT | Stargazers: 3881 | Issues: 0

gaussian-splatting

Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"

Language: Python | License: NOASSERTION | Stargazers: 13234 | Issues: 0

aimoneyhunter

The Ultimate Guide to Making Money with AI Side Hustles: a large collection showing how to use AI for side projects and earn extra income. Check out the English version for more insights.

Stargazers: 12767 | Issues: 0

PhotoMaker

PhotoMaker [CVPR 2024]

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 9185 | Issues: 0

VividTalk

VividTalk: One-Shot Audio-Driven Talking Head Generation Based on 3D Hybrid Prior

License: Apache-2.0 | Stargazers: 757 | Issues: 0

InstantID

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥

Language: Python | License: Apache-2.0 | Stargazers: 10722 | Issues: 0

pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Language: Jupyter Notebook | License: MIT | Stargazers: 5737 | Issues: 0

EfficientSAM

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 2034 | Issues: 0

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Language: Python | License: BSD-2-Clause | Stargazers: 10550 | Issues: 0

faster-whisper

Faster Whisper transcription with CTranslate2

Language: Python | License: MIT | Stargazers: 10937 | Issues: 0

opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Language: Python | License: Apache-2.0 | Stargazers: 3563 | Issues: 0

Video-LLaVA

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Language: Python | License: Apache-2.0 | Stargazers: 2787 | Issues: 0

awesome-video-text-datasets

A curated list of video-text datasets in a variety of languages. These datasets can be used for video captioning (video description) or video retrieval.

License: MIT | Stargazers: 26 | Issues: 0

leptonai

A Pythonic framework to simplify AI service building

Language: Python | License: Apache-2.0 | Stargazers: 2612 | Issues: 0

so-vits-svc

SoftVC VITS Singing Voice Conversion

Language: Python | License: AGPL-3.0 | Stargazers: 25128 | Issues: 0

EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Language: Python | License: Apache-2.0 | Stargazers: 7078 | Issues: 0

awesome-video-text-retrieval

A curated list of deep learning resources for video-text retrieval.

Stargazers: 571 | Issues: 0

pytorch-image-models

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Language: Python | License: Apache-2.0 | Stargazers: 31161 | Issues: 0

Cap4Video

[CVPR'2023 Highlight & TPAMI] Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?

Language: Python | License: MIT | Stargazers: 224 | Issues: 0

carrot

Free ChatGPT Site List: a large collection of free, easy-to-use ChatGPT mirror sites

Stargazers: 16836 | Issues: 0

Awesome-Cross-Modal-Video-Moment-Retrieval

A continuously updated list of cutting-edge papers on video moment localization, temporal language grounding, and video clip retrieval.

Stargazers: 218 | Issues: 0

Book1_Python-For-Beginners

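Book 1, "Programming Is Not Hard" (《编程不难》) | Iris Book series (鸢尾花书): from addition, subtraction, multiplication, and division to machine learning; criticism and corrections are welcome!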

Language: Jupyter Notebook | Stargazers: 4435 | Issues: 0

Test-Agent

Agent that empowers software testing with LLMs; industrial-first in China

Language: Python | License: NOASSERTION | Stargazers: 525 | Issues: 0

Chinese-CLIP

Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.

Language: Python | License: MIT | Stargazers: 4187 | Issues: 0

Image-Text-Matching-Summary

Summary of Related Research on Image-Text Matching

License: MIT | Stargazers: 55 | Issues: 0