GRAAL-Research / deepparse

Deepparse is a state-of-the-art library for parsing multinational street addresses using deep learning

Home Page: https://deepparse.org/


IndexError: index 12 is out of bounds for dimension 0 with size 12

rbhatia46 opened this issue

Hi,
I am trying to retrain Deepparse on my custom dataset. I formatted the data in the format Deepparse expects, but when I call retrain, after running for some iterations I get the error below (a simplified sketch of my setup is included after the traceback):

```
Epoch: 1/5 Step:  867/4500  19.27% |███▊                |ETA: 10m44.38s loss: 26.601751 accuracy: 27.380953

---------------------------------------------------------------------------

IndexError                                Traceback (most recent call last)

<ipython-input-13-fe5043b3f70f> in <module>()
     86                        prediction_tags=tag_dictionary,
     87                        logging_path=logging_path,
---> 88                        seq2seq_params=seq2seq_params)
     89 
     90 # Now let's test our fine-tuned model using the best checkpoint (default parameter).

7 frames

/usr/local/lib/python3.7/dist-packages/deepparse/parser/address_parser.py in retrain(self, dataset_container, train_ratio, batch_size, epochs, num_workers, learning_rate, callbacks, seed, logging_path, prediction_tags, seq2seq_params)
    447                               callbacks=callbacks,
    448                               verbose=self.verbose,
--> 449                               disable_tensorboard=True)  # to remove tensorboard automatic logging
    450 
    451         file_path = os.path.join(logging_path, f"retrained_{self.model_type}_address_parser.ckpt")

/usr/local/lib/python3.7/dist-packages/poutyne/framework/experiment.py in train(self, train_generator, valid_generator, **kwargs)
    528             List of dict containing the history of each epoch.
    529         """
--> 530         return self._train(self.model.fit_generator, train_generator, valid_generator, **kwargs)
    531 
    532     def train_dataset(self, train_dataset, valid_dataset=None, **kwargs) -> List[Dict]:

/usr/local/lib/python3.7/dist-packages/poutyne/framework/experiment.py in _train(self, training_func, callbacks, lr_schedulers, keep_only_last_best, save_every_epoch, disable_tensorboard, seed, *args, **kwargs)
    677 
    678         try:
--> 679             return training_func(*args, initial_epoch=initial_epoch, callbacks=expt_callbacks, **kwargs)
    680         finally:
    681             if self.logging:

/usr/local/lib/python3.7/dist-packages/poutyne/framework/model.py in fit_generator(self, train_generator, valid_generator, epochs, steps_per_epoch, validation_steps, batches_per_step, initial_epoch, verbose, progress_options, callbacks)
    557             self._fit_generator_n_batches_per_step(epoch_iterator, callback_list, batches_per_step)
    558         else:
--> 559             self._fit_generator_one_batch_per_step(epoch_iterator, callback_list)
    560 
    561         return epoch_iterator.epoch_logs

/usr/local/lib/python3.7/dist-packages/poutyne/framework/model.py in _fit_generator_one_batch_per_step(self, epoch_iterator, callback_list)
    629             with self._set_training_mode(True):
    630                 for step, (x, y) in train_step_iterator:
--> 631                     step.loss, step.metrics, _ = self._fit_batch(x, y, callback=callback_list, step=step.number)
    632                     step.size = self.get_batch_size(x, y)
    633 

/usr/local/lib/python3.7/dist-packages/poutyne/framework/model.py in _fit_batch(self, x, y, callback, step, return_pred)
    651 
    652         loss_tensor, metrics, pred_y = self._compute_loss_and_metrics(
--> 653             x, y, return_loss_tensor=True, return_pred=return_pred
    654         )
    655 

/usr/local/lib/python3.7/dist-packages/poutyne/framework/model.py in _compute_loss_and_metrics(self, x, y, return_loss_tensor, return_pred)
   1368         else:
   1369             pred_y = self.network(*x)
-> 1370         loss = self.loss_function(pred_y, y)
   1371         if not return_loss_tensor:
   1372             loss = float(loss)

/usr/local/lib/python3.7/dist-packages/deepparse/metrics/nll_loss.py in nll_loss(pred, ground_truth)
     12     ground_truth = ground_truth.transpose(0, 1)
     13     try:
---> 14       for i in range(pred.size(0)):
     15           loss += criterion(pred[i], ground_truth[i])
     16     except:

IndexError: index 12 is out of bounds for dimension 0 with size 12
```
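If it helps narrow this down: the loop in deepparse's nll_loss indexes `ground_truth[i]` for every `i` up to `pred.size(0)`, so the error looks like the decoder output being longer along dimension 0 than the (transposed) ground-truth tensor. Here is a minimal sketch, with made-up shapes, that reproduces the same IndexError:

```python
import torch
from torch import nn

criterion = nn.NLLLoss()

# Made-up shapes: the prediction has 13 decoding steps, but the (transposed)
# ground truth only has 12 tag positions along dimension 0.
pred = torch.rand(13, 2, 4).log_softmax(dim=-1)  # (seq_len, batch, num_tags)
ground_truth = torch.randint(0, 4, (12, 2))      # (seq_len, batch)

loss = 0
for i in range(pred.size(0)):
    # ground_truth[12] does not exist ->
    # IndexError: index 12 is out of bounds for dimension 0 with size 12
    loss += criterion(pred[i], ground_truth[i])
```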

Any idea what's going wrong here?
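For reference, here is a minimal sketch of the kind of setup I am using; the path, tag names, and hyperparameters below are placeholders rather than my real data, and the actual script is longer:

```python
from deepparse.dataset_container import PickleDatasetContainer
from deepparse.parser import AddressParser

# Pickled dataset: a list of (address, tags) tuples, with one tag per
# whitespace-separated token of the address (placeholder path and tags), e.g.
# ("350 rue des Lilas Ouest Quebec", ["StreetNumber", "StreetName",
#  "StreetName", "StreetName", "Orientation", "Municipality"])
training_container = PickleDatasetContainer("./my_dataset.p")

# Custom tag dictionary mapping each tag to an index, including the EOS tag
# (which, as I understand it, the custom tag dictionary needs).
tag_dictionary = {
    "StreetNumber": 0,
    "StreetName": 1,
    "Orientation": 2,
    "Municipality": 3,
    "EOS": 4,
}

address_parser = AddressParser(model_type="fasttext")

address_parser.retrain(
    training_container,
    train_ratio=0.8,
    epochs=5,
    batch_size=32,
    prediction_tags=tag_dictionary,
    logging_path="./checkpoints",
    # seq2seq_params={...}  # architecture overrides, omitted here
)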