LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.

Home page: https://me.kiui.moe/lgm/

Large Multi-View Gaussian Model

This is the official implementation of LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.

Demo video: demo.mp4

Install

# xformers is required! please refer to https://github.com/facebookresearch/xformers for details.
# for example, we use torch 2.1.0 + CUDA 11.8
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install -U xformers --index-url https://download.pytorch.org/whl/cu118

# a modified gaussian splatting (+ depth, alpha rendering)
git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization
pip install ./diff-gaussian-rasterization

# for mesh extraction
pip install git+https://github.com/NVlabs/nvdiffrast

# other dependencies
pip install -r requirements.txt

Pretrained Weights

Our pretrained weights can be downloaded from Hugging Face.

For example, to download the fp16 model for inference:

mkdir pretrained && cd pretrained
wget https://huggingface.co/ashawkey/LGM/resolve/main/model_fp16.safetensors
cd ..
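After downloading, it can be worth verifying that the multi-gigabyte file arrived intact. A minimal stdlib sketch (a hypothetical helper, not part of the LGM repo; no official checksum is published here, so compare the digest across your own downloads):

```python
# Hypothetical helper: stream-hash a downloaded checkpoint so it can be
# compared across machines or re-downloads. Not part of the LGM codebase.
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Read the file in 1 MB chunks so large .safetensors files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Usage: `sha256_of("pretrained/model_fp16.safetensors")` prints a 64-character hex digest.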

For MVDream and ImageDream, we use a diffusers implementation. Their weights will be downloaded automatically.

Inference

Inference takes about 10 GB of GPU memory (loading ImageDream, MVDream, and our LGM together).

### gradio app for both text/image to 3D
python app.py big --resume pretrained/model_fp16.safetensors

### test
# --workspace: folder to save output (*.ply and *.mp4)
# --test_path: path to a folder containing images, or a single image
python infer.py big --resume pretrained/model_fp16.safetensors --workspace workspace_test --test_path data_test 
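How `--test_path` handles a file versus a folder can be sketched as follows. This is an assumption about `infer.py`'s behavior (the exact extension list may differ), not the repo's actual code:

```python
# Hedged sketch (assumption, not LGM's actual implementation): a single image
# file is used as-is; a directory is expanded to every image it contains.
from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}  # assumed extension list


def collect_test_images(test_path):
    p = Path(test_path)
    if p.is_file():
        return [p]
    return sorted(q for q in p.iterdir() if q.suffix.lower() in IMAGE_EXTS)
```

Each returned path would then be processed independently, producing one `.ply` and one `.mp4` per input image in the workspace folder.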

### local gui to visualize saved ply
python gui.py big --output_size 800 --test_path workspace_test/saved.ply

### mesh conversion
python convert.py big --test_path workspace_test/saved.ply
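The saved `.ply` follows the usual 3D Gaussian Splatting layout: an ASCII header declaring one vertex element per Gaussian, followed by binary attribute data. A minimal stdlib sketch to check how many Gaussians a file holds (an illustrative helper, not part of the repo; property names can vary by exporter):

```python
# Hypothetical helper: read only the ASCII PLY header and report the vertex
# (i.e. Gaussian) count. Assumes the standard "element vertex N" declaration.
def count_gaussians(ply_path):
    with open(ply_path, "rb") as f:
        for raw in f:
            line = raw.decode("ascii", errors="replace").strip()
            if line.startswith("element vertex"):
                return int(line.split()[-1])
            if line == "end_header":
                break
    raise ValueError("no vertex element found in PLY header")
```

This only touches the header, so it is cheap even for files with millions of Gaussians.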

For more options, please check options.

Training

NOTE: The dataset used in our training is hosted on AWS, so it cannot be used directly in a new environment. We provide the necessary training code framework; please check and modify the dataset implementation accordingly!

# debug training
accelerate launch --config_file acc_configs/gpu1.yaml main.py big --workspace workspace_debug

# training (use slurm for multi-nodes training)
accelerate launch --config_file acc_configs/gpu8.yaml main.py big --workspace workspace

Acknowledgement

This work is built on many amazing research works and open-source projects, thanks a lot to all the authors for sharing!

Citation

@article{tang2024lgm,
  title={LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation},
  author={Tang, Jiaxiang and Chen, Zhaoxi and Chen, Xiaokang and Wang, Tengfei and Zeng, Gang and Liu, Ziwei},
  journal={arXiv preprint arXiv:2402.05054},
  year={2024}
}

License: MIT License

