wqdong8

Zhejiang University

Hangzhou, China

wqdong8's starred repositories

nvTorchCam

Language: Python · Apache-2.0 · 3700

lectures

Material for cuda-mode lectures

Language: Jupyter Notebook · Apache-2.0 · 202800

cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Language: Python · Apache-2.0 · 162100

LAS-Diffusion

Language: Python · MIT · 21200

GeoLRM

Geometry-Aware Large Reconstruction Model for Efficient and High-Quality 3D Generation

Language: Python · Apache-2.0 · 8500

DiffSynth-Studio

Enjoy the magic of Diffusion models!

Language: Python · Apache-2.0 · 604900

StableNormal

StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal

Language: Python · Apache-2.0 · 11700

LLM101n

LLM101n: Let's build a Storyteller

rectified_flow_prior

Official code for paper: Text-to-Image Rectified Flow as Plug-and-Play Priors

Language: Python · 5600

hallo

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

Language: Python · MIT · 767400

XCube

[CVPR 2024 Highlight] XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies

Language: Python · NOASSERTION · 18100

3Doodle

Official implementation of 3Doodle: Compact Abstraction of Objects with 3D Strokes (SIGGRAPH '24, Journal track)

Language: Python · 4600

NKSR

[CVPR 2023 Highlight] Neural Kernel Surface Reconstruction

Language: Python · NOASSERTION · 71700

OHTA

[CVPR 2024] OHTA: One-shot Hand Avatar via Data-driven Implicit Priors

Language: Python · MIT · 1800

Recap-DataComp-1B

This is the official repository of our paper "What If We Recaption Billions of Web Images with LLaMA-3?"

MeshAnything

From anything to mesh like human artists. Official impl. of "MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers"

Language: Python · NOASSERTION · 184300

M-LRM

MIT · 3900

UniDepth

Universal Monocular Metric Depth Estimation

Language: Python · NOASSERTION · 49900

Coverage_Axis

Official code for the paper Coverage Axis: Inner Point Selection for 3D Shape Skeletonization, Eurographics 2022.

Language: C++ · 7500

LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Language: Python · MIT · 110400

GeoWizard

[ECCV'24] GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image

Language: Python · 65800

masa

Official implementation of the CVPR 2024 highlight paper: Matching Anything by Segmenting Anything

Language: Python · Apache-2.0 · 90800

direct3d

Language: Python · MIT · 4600

Physics3D

Official implementation of Physics3D: Learning Physical Properties of 3D Gaussians via Video Diffusion

Language: Python · MIT · 12800

EMAP

[CVPR'24] 3D Neural Edge Reconstruction

Language: Python · MIT · 13700

streamv2v

Official PyTorch implementation of StreamV2V.

Language: Python · NOASSERTION · 41100

3DHighlighter

Localizing Regions on 3D Shapes via Text Descriptions

Language: Python · 9700

MiniCPM-V

MiniCPM-Llama3-V 2.5: A GPT-4V Level Multimodal LLM on Your Phone

Language: Python · Apache-2.0 · 816000

Reason3D-PyTorch

Reasoning 3D Segmentation - "segment anything"/grounding/part separation in 3D with natural conversations.

Language: Python · NOASSERTION · 7300

swift

ms-swift: Use PEFT or Full-parameter to finetune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Language: Python · Apache-2.0 · 270100