graviraja / MLOps-Basics


Is training happening?

rohitgr7 opened this issue · comments

def training_step(self, batch, batch_idx):
    logits = self.forward(batch["input_ids"], batch["attention_mask"])
    loss = F.cross_entropy(logits, batch["label"])
    self.log("train_loss", loss, prog_bar=True)

Here the loss is not returned; is the model even training?

@rohitgr7 we are logging it to the logger. There is no need to return the loss unless you want to perform some operation on the overall loss in an epoch. I have done that in week 1 for the validation step. Refer here: https://github.com/graviraja/MLOps-Basics/blob/main/week_1_wandb_logging/model.py. If you return the loss, you can access it in the training_epoch_end method.
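
For illustration, a minimal sketch of that pattern, assuming the pre-2.0 Lightning API (training_epoch_end receives the collected training_step outputs; the train_loss_epoch key is just an example name):

def training_step(self, batch, batch_idx):
    logits = self.forward(batch["input_ids"], batch["attention_mask"])
    loss = F.cross_entropy(logits, batch["label"])
    self.log("train_loss", loss, prog_bar=True)
    # returning a dict makes the key explicit when the outputs
    # are collected at the end of the epoch
    return {"loss": loss}

def training_epoch_end(self, outputs):
    # outputs is a list with one entry per training_step call
    avg_loss = torch.stack([x["loss"] for x in outputs]).mean()
    self.log("train_loss_epoch", avg_loss)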

@graviraja I checked; Lightning does not look in logged_metrics for the loss to perform backprop. I get this warning when nothing is returned from training_step: training_step returned None. If this was on purpose, ignore this warning...

Also, the docs mention that if nothing is returned, the corresponding training_step is skipped: https://pytorch-lightning.readthedocs.io/en/latest/common/lightning_module.html#training-step

A minimal example to reproduce: https://colab.research.google.com/drive/11qA_1RxcEcHkiY-Xn5EsOR8ZH0wG8O1j#scrollTo=AAtq1hwSmjKe

Fixed it. Thank you @rohitgr7
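
For anyone who finds this later, the fix presumably amounts to returning the loss from training_step so Lightning can run the backward pass; a minimal sketch of the corrected method:

def training_step(self, batch, batch_idx):
    logits = self.forward(batch["input_ids"], batch["attention_mask"])
    loss = F.cross_entropy(logits, batch["label"])
    self.log("train_loss", loss, prog_bar=True)
    # returning the loss lets Lightning call backward() and step the
    # optimizer; returning nothing skips optimization for this batch
    return loss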