ModuleNotFoundError: No module named 'tinycudann'
kerrigenwan commented
Can someone help me look into the error I ran into? Here is the full output:
Training with 1 GPUs.
Using random seed 0
Make folder logs/example_group/example_name
- checkpoint:
- save_epoch: 9999999999
- save_iter: 20000
- save_latest_iter: 9999999999
- save_period: 9999999999
- strict_resume: True
- cudnn:
- benchmark: True
- deterministic: False
- data:
- name: dummy
- num_images: None
- num_workers: 4
- preload: True
- readjust:
- center: [0.0, 0.0, 0.0]
- scale: 1.0
- root: datasets/lego_ds2
- train:
- batch_size: 2
- image_size: [802, 802]
- subset: None
- type: projects.neuralangelo.data
- use_multi_epoch_loader: True
- val:
- batch_size: 2
- image_size: [300, 300]
- max_viz_samples: 16
- subset: 4
- image_save_iter: 9999999999
- inference_args:
- local_rank: 0
- logdir: logs/example_group/example_name
- logging_iter: 9999999999999
- max_epoch: 9999999999
- max_iter: 500000
- metrics_epoch: None
- metrics_iter: None
- model:
- appear_embed:
- dim: 8
- enabled: False
- background:
- enabled: True
- encoding:
- levels: 10
- type: fourier
- encoding_view:
- levels: 3
- type: spherical
- mlp:
- activ: relu
- activ_density: softplus
- activ_density_params:
- activ_params:
- hidden_dim: 256
- hidden_dim_rgb: 128
- num_layers: 8
- num_layers_rgb: 2
- skip: [4]
- skip_rgb: []
- view_dep: True
- white: False
- object:
- rgb:
- encoding_view:
- levels: 3
- type: spherical
- mlp:
- activ: relu_
- activ_params:
- hidden_dim: 256
- num_layers: 4
- skip: []
- weight_norm: True
- mode: idr
- encoding_view:
- s_var:
- anneal_end: 0.1
- init_val: 3.0
- sdf:
- encoding:
- coarse2fine:
- enabled: True
- init_active_level: 4
- step: 5000
- hashgrid:
- dict_size: 22
- dim: 8
- max_logres: 11
- min_logres: 5
- range: [-2, 2]
- levels: 16
- type: hashgrid
- coarse2fine:
- gradient:
- mode: numerical
- taps: 4
- mlp:
- activ: softplus
- activ_params:
- beta: 100
- geometric_init: True
- hidden_dim: 256
- inside_out: False
- num_layers: 1
- out_bias: 0.5
- skip: []
- weight_norm: True
- encoding:
- rgb:
- render:
- num_sample_hierarchy: 4
- num_samples:
- background: 32
- coarse: 64
- fine: 16
- rand_rays: 512
- stratified: True
- type: projects.neuralangelo.model
- appear_embed:
- nvtx_profile: False
- optim:
- fused_opt: False
- params:
- lr: 0.001
- weight_decay: 0.01
- sched:
- gamma: 10.0
- iteration_mode: True
- step_size: 9999999999
- two_steps: [300000, 400000]
- type: two_steps_with_warmup
- warm_up_end: 5000
- type: AdamW
- pretrained_weight: None
- source_filename: projects/neuralangelo/configs/custom/lego.yaml
- speed_benchmark: False
- test_data:
- name: dummy
- num_workers: 0
- test:
- batch_size: 1
- is_lmdb: False
- roots: None
- type: imaginaire.datasets.images
- timeout_period: 9999999
- trainer:
- amp_config:
- backoff_factor: 0.5
- enabled: False
- growth_factor: 2.0
- growth_interval: 2000
- init_scale: 65536.0
- ddp_config:
- find_unused_parameters: False
- static_graph: True
- depth_vis_scale: 0.5
- ema_config:
- beta: 0.9999
- enabled: False
- load_ema_checkpoint: False
- start_iteration: 0
- grad_accum_iter: 1
- image_to_tensorboard: False
- init:
- gain: None
- type: none
- loss_weight:
- curvature: 0.0005
- eikonal: 0.1
- render: 1.0
- type: projects.neuralangelo.trainer
- amp_config:
- validation_iter: 5000
- wandb_image_iter: 10000
- wandb_scalar_iter: 100
cudnn benchmark: True
cudnn deterministic: False
Setup trainer.
Using random seed 0
Traceback (most recent call last):
File "train.py", line 104, in <module>
main()
File "train.py", line 79, in main
trainer = get_trainer(cfg, is_inference=False, seed=args.seed)
File "/home/intel/neuralangelo/imaginaire/trainers/utils/get_trainer.py", line 32, in get_trainer
trainer = trainer_lib.Trainer(cfg, is_inference=is_inference, seed=seed)
File "/home/intel/neuralangelo/projects/neuralangelo/trainer.py", line 26, in __init__
super().__init__(cfg, is_inference=is_inference, seed=seed)
File "/home/intel/neuralangelo/projects/nerf/trainers/base.py", line 28, in __init__
super().__init__(cfg, is_inference=is_inference, seed=seed)
File "/home/intel/neuralangelo/imaginaire/trainers/base.py", line 50, in __init__
self.model = self.setup_model(cfg, seed=seed)
File "/home/intel/neuralangelo/imaginaire/trainers/base.py", line 116, in setup_model
lib_model = importlib.import_module(cfg.model.type)
File "/home/intel/miniconda3/envs/neuralangelo/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/intel/neuralangelo/projects/neuralangelo/model.py", line 21, in <module>
from projects.neuralangelo.utils.modules import NeuralSDF, NeuralRGB, BackgroundNeRF
File "/home/intel/neuralangelo/projects/neuralangelo/utils/modules.py", line 16, in <module>
import tinycudann as tcnn
ModuleNotFoundError: No module named 'tinycudann'
[2024-03-25 20:11:48,544] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 37238) of binary: /home/intel/miniconda3/envs/neuralangelo/bin/python
Traceback (most recent call last):
File "/home/intel/miniconda3/envs/neuralangelo/bin/torchrun", line 10, in <module>
sys.exit(main())
File "/home/intel/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/home/intel/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/run.py", line 806, in main
run(args)
File "/home/intel/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/run.py", line 797, in run
elastic_launch(
File "/home/intel/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/intel/miniconda3/envs/neuralangelo/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
train.py FAILED
Failures:
<NO_OTHER_FAILURES>
Root Cause (first observed failure):
[0]:
time : 2024-03-25_20:11:48
host : intel-MD72-HB1-00
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 37238)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
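
The first traceback ends at `import tinycudann as tcnn` in `projects/neuralangelo/utils/modules.py`, and the later `ChildFailedError` from `torchrun` only reports that this child process exited. So the failure is most likely that the interpreter in the `neuralangelo` conda environment cannot find the tiny-cuda-nn PyTorch bindings. Below is a minimal sketch for checking that from the same environment. The helper name `is_module_available` is made up for illustration and is not part of neuralangelo or imaginaire.

```python
import importlib.util
import sys


def is_module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be located by this interpreter."""
    # find_spec returns None (rather than raising) when a top-level module is absent.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Print the interpreter path to confirm the check runs inside the intended conda env.
    print("interpreter:", sys.executable)
    for mod in ("torch", "tinycudann"):
        status = "found" if is_module_available(mod) else "MISSING"
        print(f"{mod}: {status}")
```

If `tinycudann` is reported as missing, the bindings have to be installed into this environment. The tiny-cuda-nn README describes building the PyTorch extension with pip from `git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch`. Whether that build succeeds here depends on the local CUDA toolkit and compiler, which the log above does not show.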