Zasder3 / train-CLIP

A PyTorch Lightning solution to training OpenAI's CLIP from scratch.

MisconfigurationException: `train_dataloader` must be implemented to be used with the Lightning Trainer

antitheos opened this issue

I am trying to train a model using the following command:
python train.py --model_name RN50 --folder ArchDaily --batch_size 512 --accelerator cuda

I get the following error:
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/hooks.py", line 485, in train_dataloader
raise MisconfigurationException("train_dataloader must be implemented to be used with the Lightning Trainer")
pytorch_lightning.utilities.exceptions.MisconfigurationException: train_dataloader must be implemented to be used with the Lightning Trainer

Grateful for any assistance.

And here is the full message log:

Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/configuration_validator.py:119: PossibleUserWarning: You defined a validation_step but have no val_dataloader. Skipping val loop.
category=PossibleUserWarning,
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Traceback (most recent call last):
File "train.py", line 31, in
main(args)
File "train.py", line 20, in main
trainer.fit(model, dm)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 701, in fit
self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 654, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 741, in _fit_impl
results = self._run(model, ckpt_path=self.ckpt_path)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1147, in _run
self.strategy.setup(self)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/single_device.py", line 74, in setup
super().setup(trainer)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/strategy.py", line 153, in setup
self.setup_optimizers(trainer)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/strategy.py", line 142, in setup_optimizers
self.lightning_module
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/optimizer.py", line 179, in _init_optimizers_and_lr_schedulers
optim_conf = model.trainer._call_lightning_module_hook("configure_optimizers", pl_module=model)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1549, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/content/train-CLIP/models/wrapper.py", line 146, in configure_optimizers
first_cycle_steps=self.num_training_steps,
File "/content/train-CLIP/models/wrapper.py", line 38, in num_training_steps
dataset = self.train_dataloader()
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/hooks.py", line 485, in train_dataloader
raise MisconfigurationException("train_dataloader must be implemented to be used with the Lightning Trainer")
pytorch_lightning.utilities.exceptions.MisconfigurationException: train_dataloader must be implemented to be used with the Lightning Trainer

I had the same issue. The num_training_steps function calls self.train_dataloader(), but the dataloaders are implemented on the LightningDataModule rather than on the LightningModule itself, so the default hook raises. I circumvented the problem by caching the dataset length and batch size in the setup function and using them in num_training_steps.
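A minimal sketch of that workaround, assuming the dataloaders live on a LightningDataModule attached to the trainer; the class name, attribute names, and step arithmetic below are illustrative, not train-CLIP's exact code:

import math
import pytorch_lightning as pl

class CLIPWrapper(pl.LightningModule):
    def setup(self, stage=None):
        # Cache what num_training_steps needs here, where the attached
        # DataModule is already reachable through the trainer.
        loader = self.trainer.datamodule.train_dataloader()
        self.dataset_size = len(loader.dataset)    # hypothetical attribute
        self.train_batch_size = loader.batch_size  # hypothetical attribute

    @property
    def num_training_steps(self) -> int:
        # Steps per epoch times epochs; ignores gradient accumulation
        # and multi-GPU sharding for brevity.
        steps_per_epoch = math.ceil(self.dataset_size / self.train_batch_size)
        return steps_per_epoch * self.trainer.max_epochs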

I solved this problem by downgrading the pytorch-lightning version. My pytorch-lightning version is 1.4.9.
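For anyone who wants to try the same pin:

pip install pytorch-lightning==1.4.9

Note that older Lightning versions may expect different Trainer flags than the command at the top of this issue.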

You can change wrapper.py: replace dataset = self.train_dataloader() with dataset = self.trainer.datamodule.train_dataloader()
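In context, the patched property would look roughly like this; everything around the changed line is a guess at wrapper.py's shape, not a verbatim copy:

@property
def num_training_steps(self) -> int:
    # The module's own train_dataloader() hook is unimplemented (the
    # dataloaders are on the DataModule), so ask the trainer's datamodule.
    dataset = self.trainer.datamodule.train_dataloader()
    # Illustrative: one optimizer step per batch per epoch.
    return len(dataset) * self.trainer.max_epochs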