SFidea's repositories
LatentLumiere
A naive attempt at reproducing Google Lumiere with a Stable Diffusion t2i base model
4d-gaussian-splatting
[ICLR 2024] Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting
4DGaussians
4D Gaussian Splatting for Real-Time Dynamic Scene Rendering
aframe-gaussian-splatting
Fork of aframe gaussian splat to play with new features
Animate124
Animate124: Animating One Image to 4D Dynamic Scene
AnimateDiff
Official implementation of AnimateDiff.
Apple-Vision-Pro-UI-Kit
Free UI asset kit you can use to prototype and test interactive interfaces in Apple Vision Pro’s design system. Compatible with any XR headset with pass-through mode, including Meta Quest and Meta Quest Pro.
ChatLM-mini-Chinese
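A small 0.2B Chinese dialogue model (ChatLM-Chinese-0.2B). Open-sources all code for the full pipeline: dataset sources, data cleaning, tokenizer training, model pre-training, SFT instruction fine-tuning, RLHF optimization, and more. Supports SFT fine-tuning for downstream tasks.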
Depth-Anything
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
dreamgaussian4d
[arXiv 2023] DreamGaussian4D: Generative 4D Gaussian Splatting
EdgeRealtimeVideoAnalytics
An example of using Redis Streams, RedisGears, RedisAI and RedisTimeSeries for Realtime Video Analytics (i.e. counting people)
Fay
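Fay is a complete open-source project that includes the Fay controller and digital human models, which can be flexibly combined for different application scenarios: virtual streamer, live sales promotion, product shopping guide, voice assistant, remote voice assistant, digital human interaction, digital human interviewer and psychological assessment, Jarvis, Her. This is an open-source project, not a product trial!!!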
fly-by
Procedurally generated terrain builder. A fly over experience.
frontend-park
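🌸 A fun front-end knowledge park. This project is a collection of examples from my everyday tinkering with front-end technologies. [Covers: (Tensorflow.js: pose recognition, face recognition), (WebRTC: audio/video calls, screen recording, virtual backgrounds, signaling server), (Threejs: solar system, 3D animation), (image processing: photo mosaics, image compression, drawing board), (steganography: text steganographic encryption, image steganographic encryption), and more...]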
gligen-gui
An intuitive GUI for GLIGEN that uses ComfyUI in the backend
gsgen
Text-to-3D using Gaussian Splatting
infinigen
Infinite Photorealistic Worlds using Procedural Generation
Latte
The official implementation of Latte: Latent Diffusion Transformer for Video Generation.
MING
明医 (MING): A Chinese large language model for medical consultation
NeRFCapture
An iOS app that collects/streams posed images for NeRFs using ARKit
Online3DViewer
A solution to visualize and explore 3D models in your browser.
PPHC
📙 "Philosophical Principles of High Concurrency" (《高并发的哲学原理》), an open-source book (CC BY-NC-ND) https://pphc.lvwenhan.com
SegAnyGAussians
The official implementation of SAGA (Segment Any 3D GAussians)
speech-to-text
Real-time transcription using faster-whisper
SplaTAM
SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM
super-splat
3D Gaussian Splat Editor
talking-avatar-with-ai
This project is a digital human that can talk and listen to you. It uses OpenAI's GPT-3 to generate responses, OpenAI's Whisper to transcribe the audio, Eleven Labs to generate the voice, and Rhubarb Lip Sync to generate the lip sync.
TalkingHead
Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.
threejs-learning
A digital twin campus showcase built with threejs + vue3
Whisper
High-performance GPGPU inference of OpenAI's Whisper automatic speech recognition (ASR) model