Running scripts/sampling/simple_video_sample.py with sv3d_u terminates with no error and no outputs

Question

horenbergerb opened this issue 2 months ago

Running the following command:

python scripts/sampling/simple_video_sample.py --version sv3d_u --verbose --decoding_t 1

The program starts, but then it seems to just die suddenly with no warnings near the very end of inference:

...
VideoTransformerBlock is using checkpointing
Initialized embedder #0: FrozenOpenCLIPImagePredictionEmbedder with 683800065 params. Trainable: False
Initialized embedder #1: VideoPredictionEmbedderWithEncoder with 83653863 params. Trainable: False
Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False
Restored from checkpoints/sv3d_u.safetensors with 0 missing and 0 unexpected keys
##############################  Sampling setting  ##############################
Sampler: EulerEDMSampler
Discretization: EDMDiscretization
Guider: TrianglePredictionGuider
Sampling with EulerEDMSampler for 51 steps:   0%|                                                                                                                    | 0/51 [00:00<?, ?it/s]/home/captdishwasher/horenbergerb/sv3d/generative-models/.pt2/lib/python3.10/site-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
Sampling with EulerEDMSampler for 51 steps:  98%|████████████████████████████████████████████████████████████████████████████████████████████████████████▉  | 50/51 [01:31<00:01,  1.83s/it]
(.pt2) (base) captdishwasher@captainofthedishwasher-MS-7D43:~/horenbergerb/sv3d/generative-models$

Sometimes the output folder is empty, and sometimes it contains a single JPG and a ~20 MB MP4 that is corrupted and won't play.

Is there something I'm missing? Is this a VRAM problem? I'm running 24GB of VRAM on an NVIDIA 3090, Ubuntu 22.04.

Thanks!
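One way to narrow down a silent exit like this is to look at the process's exit status, since the shell prompt alone does not show it. The sketch below is only a diagnostic aid under assumptions: `describe_exit` and `run_and_report` are hypothetical helper names, and a negative return code from `subprocess` means the child was terminated by a signal (POSIX only), which may or may not be what happens here.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a short human-readable message."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with status {returncode}"


def run_and_report(cmd) -> str:
    """Run cmd, let its output pass through, and describe how it ended."""
    result = subprocess.run(cmd)
    return describe_exit(result.returncode)


# Example (hypothetical wrapper around the command from the post):
# print(run_and_report([sys.executable,
#                       "scripts/sampling/simple_video_sample.py",
#                       "--version", "sv3d_u", "--verbose",
#                       "--decoding_t", "1"]))
```

If this reports a signal such as SIGKILL, checking the kernel log for an out-of-memory kill would be a reasonable next step. A plain "exited normally" would point away from memory pressure and toward the video-writing step.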

Eli Halpern · Answer 1 · Tue Mar 19 2024 09:54:51 GMT+0800 (China Standard Time)

Experiencing the same issue!

Beau Horenberger · Answer 2 · Tue Mar 19 2024 10:00:11 GMT+0800 (China Standard Time)

I am having more success using streamlit. I'm getting output images and mp4s.

winintel.com · Answer 3 · Tue Mar 19 2024 10:42:03 GMT+0800 (China Standard Time)

python scripts/sampling/simple_video_sample.py --input_path images/pig.png --version sv3d_p --elevations_deg 10.0
no module 'xformers'. Processing without...
no module 'xformers'. Processing without...
Traceback (most recent call last):
File "E:\dl\generative-models\scripts\sampling\simple_video_sample.py", line 36, in
elevations_deg: Optional[float | List[float]] = 10.0, # For SV3D
TypeError: unsupported operand type(s) for |: 'type' and '_GenericAlias'

Beau Horenberger · Answer 4 · Tue Mar 19 2024 20:13:13 GMT+0800 (China Standard Time)

> python scripts/sampling/simple_video_sample.py --input_path images/pig.png --version sv3d_p --elevations_deg 10.0
> no module 'xformers'. Processing without...
> no module 'xformers'. Processing without...
> Traceback (most recent call last):
> File "E:\dl\generative-models\scripts\sampling\simple_video_sample.py", line 36, in
> elevations_deg: Optional[float | List[float]] = 10.0, # For SV3D
> TypeError: unsupported operand type(s) for |: 'type' and '_GenericAlias'

You're not using Python 3.10. The `float | List[float]` annotation on that line only works on Python 3.10 or later, so recreate your venv with Python 3.10 and that issue will go away.
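For context on why the interpreter version matters: the `|` between types in the traceback is the PEP 604 union operator, which is only evaluated successfully at runtime on Python 3.10 and later. A minimal sketch follows. The names `ElevationsDeg` and `supports_pipe_unions` are hypothetical and not part of the repository, and the `typing.Union` spelling is shown only as an equivalent that older interpreters accept, not as a change the project has made.

```python
import sys
from typing import List, Optional, Union

# Same meaning as Optional[float | List[float]], but written with
# typing.Union so it also evaluates on Python versions before 3.10.
ElevationsDeg = Optional[Union[float, List[float]]]


def supports_pipe_unions(version_info=sys.version_info) -> bool:
    """Return True if `float | List[float]` can be evaluated at runtime."""
    return tuple(version_info[:2]) >= (3, 10)


def normalize_elevations(elevations_deg: ElevationsDeg = 10.0) -> List[float]:
    """Accept a single float, a list of floats, or None; return a list."""
    if elevations_deg is None:
        return []
    if isinstance(elevations_deg, (int, float)):
        return [float(elevations_deg)]
    return [float(e) for e in elevations_deg]
```

Running `python --version` inside the activated venv is the quickest way to confirm which interpreter the script is actually using.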

megeek · Answer 5 · Wed Mar 20 2024 07:03:44 GMT+0800 (China Standard Time)

I'm also having the same issue, running Python 3.10.4 with an NVIDIA 4090 on Ubuntu 22.04.1.

winintel.com · Answer 6 · Wed Mar 20 2024 09:46:07 GMT+0800 (China Standard Time)

> > python scripts/sampling/simple_video_sample.py --input_path images/pig.png --version sv3d_p --elevations_deg 10.0
> > no module 'xformers'. Processing without...
> > no module 'xformers'. Processing without...
> > Traceback (most recent call last):
> > File "E:\dl\generative-models\scripts\sampling\simple_video_sample.py", line 36, in
> > elevations_deg: Optional[float | List[float]] = 10.0, # For SV3D
> > TypeError: unsupported operand type(s) for |: 'type' and '_GenericAlias'
>
> You're not using Python 3.10. Recreate your venv with Python 3.10 and that issue will go away

Right, but even after doing that, switching to CUDA 12, and reinstalling PyTorch and the other packages, running it still raises a lot of errors.

qixuanwang-233 · Answer 7 · Wed Mar 20 2024 12:08:16 GMT+0800 (China Standard Time)

Hello, have you solved the problem? I encountered the same problem and the output is a damaged video. QWQ

yykani · Answer 8 · Wed Mar 20 2024 15:10:06 GMT+0800 (China Standard Time)

I encountered the same issue when running the program with Python 3.10.12 on WSL on Windows 11.
The generated files are 000000.jpg and 000000.mp4. I can open the jpg file correctly.
Can anyone tell what the cause is from the following output?

xxx@xxx:/mnt/c/Users/xxx/generative-models$ python scripts/sampling/simple_video_sample.py --input_path "/mnt/c/Users/xxx/Downloads/00001-2141331527_2.png" --version sv3d_u
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
VideoTransformerBlock is using checkpointing
Initialized embedder #0: FrozenOpenCLIPImagePredictionEmbedder with 683800065 params. Trainable: False
Initialized embedder #1: VideoPredictionEmbedderWithEncoder with 83653863 params. Trainable: False
Initialized embedder #2: ConcatTimestepEmbedderND with 0 params. Trainable: False
Restored from checkpoints/sv3d_u.safetensors with 0 missing and 0 unexpected keys
/mnt/c/Users/xxx/generative-models/.pt2/lib/python3.10/site-packages/torch/utils/checkpoint.py:31: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")

xxx@xxx:/mnt/c/Users/xxx/generative-models$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:        22.04
Codename:       jammy

yykani · Answer 9 · Wed Mar 20 2024 17:52:30 GMT+0800 (China Standard Time)

By running it with streamlit, I could generate the individual frame images of the input image rotating as a 3D model. The video is still corrupted, but you can make a video if you join the frame images yourself.
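Joining the frames by hand could look roughly like the sketch below. It assumes the frames are saved as zero-padded image files in one folder (the `*.png` pattern, the helper names, and the default `fps` are all hypothetical), and that `imageio` with a working ffmpeg backend such as `imageio-ffmpeg` is installed. It is not how the repository itself writes videos.

```python
from pathlib import Path


def collect_frames(frame_dir, pattern="*.png"):
    """Return frame paths sorted by file name.

    Name order equals playback order only if the names are zero-padded
    (for example 000000.png, 000001.png, ...), which is an assumption.
    """
    return sorted(Path(frame_dir).glob(pattern))


def frames_to_video(frame_dir, out_path, fps=10, pattern="*.png"):
    """Read all matching frames and write them to a single video file."""
    # Imported lazily so collect_frames works without imageio installed.
    import imageio.v2 as imageio

    paths = collect_frames(frame_dir, pattern)
    if not paths:
        raise ValueError(f"no frames matching {pattern!r} in {frame_dir}")
    frames = [imageio.imread(str(p)) for p in paths]
    # Writing .mp4 relies on an ffmpeg backend being available to imageio.
    imageio.mimwrite(str(out_path), frames, fps=fps)
    return out_path


# Example with hypothetical paths:
# frames_to_video("outputs/frames", "outputs/joined.mp4", fps=10)
```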

yykani · Answer 10 · Wed Mar 20 2024 18:22:56 GMT+0800 (China Standard Time)

I fixed it by installing imageio-ffmpeg and pyav, as @timofimo suggested in issue #313. This works even though I don't use streamlit.
pip install imageio-ffmpeg pyav
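To confirm that the install took effect in the environment the script actually runs in, a quick import check can help. This is a minimal sketch: `missing_modules` is a hypothetical helper, and the import names `imageio_ffmpeg` and `av` are what the imageio-ffmpeg and PyAV projects normally expose, which is an assumption about what the packages above provide.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Example: check the video-writing dependencies from inside the venv.
# missing = missing_modules(["imageio", "imageio_ffmpeg", "av"])
# print("missing:", missing or "none")
```

An empty list means all three modules are visible to the interpreter. Any name that is listed suggests the install went into a different environment than the one running the sampling script.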