TheGenerativeGeneration's repositories
Thin-Plate-Spline-Motion-Model
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.
4DGaussians
4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
Depth-Anything-V2
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
gaussian_surfels
Implementation of the SIGGRAPH 2024 conference paper "High-quality Surface Reconstruction using Gaussian Surfels".
GLiNER
Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
gliner-finetune
A package for generating synthetic data and fine-tuning a GLiNER model.
gliner-spacy
A spaCy wrapper for GLiNER
humannerf
HumanNeRF turns a monocular video of moving people into a 360 free-viewpoint video.
Instant-angelo
Instant-angelo: Build a high-fidelity digital twin within 20 minutes!
InternGPT
InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
MobileR2L
[CVPR 2023] Real-Time Neural Light Field on Mobile Devices
nerfstudio
A collaboration-friendly studio for NeRFs
Re-ReND
Re-ReND: Real-time Rendering of NeRFs across Devices
Real-ESRGAN
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
RobustVideoMatting
Robust Video Matting in PyTorch, TensorFlow, TensorFlow.js, ONNX, CoreML!
stable-dreamfusion
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
supersplat
3D Gaussian Splat Editor
torch-merf
An unofficial PyTorch implementation of MeRF
UnityGaussianSplatting
Toy Gaussian Splatting visualization in Unity
unsloth
Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
VIBE
Official implementation of CVPR2020 paper "VIBE: Video Inference for Human Body Pose and Shape Estimation"
vid2avatar
Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition (CVPR2023)
whisper_streaming
Whisper realtime streaming for long speech-to-text transcription and translation
wtpsplit
Toolkit to segment text into sentences or other semantic units in a robust, efficient and adaptable way.