JulesBelveze / time-series-autoencoder

PyTorch Dual-Attention LSTM-Autoencoder For Multivariate Time Series


AssertionError: Pytorch Issue with prediction window > 1

Rajmehta123 opened this issue · comments

AssertionError if prediction window > 1.

torch==1.4.0

Traceback (most recent call last):

  File "<ipython-input-96-df6f6f907e9b>", line 107, in <module>
    run(vars(args))

  File "<ipython-input-96-df6f6f907e9b>", line 90, in run
    train_iter, test_iter, nb_features = ts.get_loaders(batch_size=config["batch_size"])

  File "/Users/rmehta/fin/AI4Fin/time-series-autoencoder/tsa/dataset.py", line 84, in get_loaders
    train_dataset = self.frame_series(X_train, y_train)

  File "/Users/rmehta/fin/AI4Fin/time-series-autoencoder/tsa/dataset.py", line 70, in frame_series
    return TensorDataset(features_var, y_hist_var, target_var)

  File "/opt/anaconda3/envs/rlf/lib/python3.7/site-packages/torch/utils/data/dataset.py", line 158, in __init__
    assert all(tensors[0].size(0) == tensor.size(0) for tensor in tensors)

AssertionError

Hey @Rajmehta123, thanks for pointing that out!
I will have a look at it!

I have a hacky solution. I'm not sure it is robust enough, but I tested it with multiple prediction windows and training went well. You are unsqueezing the features and y_hist, but not the targets. So when prediction_window is 1, the shapes happen to match, but when it is > 1, the tensors' shapes disagree.
Let me see if I can open a PR for this. It needs to be tested extensively though.
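For context, `TensorDataset` asserts that all of its tensors share the same `size(0)`. Here is a stdlib-only sketch (no torch; the frame counts are illustrative, not taken from the repo) of why leaving the 1-D target chunks un-unsqueezed trips that assertion once the prediction window exceeds 1:

```python
# torch.cat along dim 0 sums the chunks' size(0), so n equal chunks whose
# first dimension is k concatenate to a tensor with size(0) == n * k.
def cat_size0(n_chunks, chunk_size0):
    return n_chunks * chunk_size0

n_frames, prediction_window = 50, 3  # illustrative values

# features / y_hist are unsqueezed to shape (1, ...) per frame:
features_size0 = cat_size0(n_frames, 1)                # 50
# targets are left 1-D with shape (prediction_window,):
target_size0 = cat_size0(n_frames, prediction_window)  # 150

# TensorDataset asserts all tensors share size(0); with prediction_window == 1
# both equal n_frames, but for anything larger the assertion fires.
assert features_size0 != target_size0
```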

Sounds amazing @Rajmehta123!
I'll review that PR and run some tests :)

Hey Jules, I changed the dataset loader's frame function to the following:

    def frame_series(self, X, y=None):
        """
        Prepare the data for time series prediction.
        :param X: set of features
        :param y: target values to predict
        :return: TensorDataset
        """
        nb_obs, nb_features = X.shape
        features, target, y_hist = [], [], []

        for i in range(1, nb_obs - self.seq_length - self.prediction_window):
            features.append(torch.FloatTensor(X[i:i + self.seq_length, :]).unsqueeze(0))
            y_hist.append(torch.FloatTensor(y[i:i + self.seq_length]).unsqueeze(0))

        features_var, y_hist_var = torch.cat(features), torch.cat(y_hist)

        if y is not None:
            for i in range(1, nb_obs - self.seq_length - self.prediction_window):
                target.append(
                    torch.FloatTensor(y[i + self.seq_length:i + self.seq_length + self.prediction_window]).unsqueeze(0)
                )
            target_var = torch.cat(target)
            return TensorDataset(features_var, y_hist_var, target_var)

        return TensorDataset(features_var)
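As a quick stdlib-only sanity check (illustrative numbers, not from the repo): the feature and target loops above iterate over the same range, and every frame is unsqueezed to `size(0) == 1`, so all three tensors passed to `TensorDataset` end up with the same first dimension:

```python
# Both loops in the patched frame_series use the same range, so features,
# y_hist, and target all produce the same number of frames.
nb_obs, seq_length, prediction_window = 100, 10, 3  # illustrative values

n_frames = len(range(1, nb_obs - seq_length - prediction_window))
print(n_frames)  # 86 windows each for features, y_hist, and target
```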

I just unsqueezed the targets as well, so they have the same dimensions as the features/y_hist. But that deteriorated the accuracy, so unsqueezing the targets didn't help. The tensor shapes are not robust: the target tensor has a different shape whenever prediction_window > 1.

Hey @Rajmehta123, thanks for spending time on this! 😃
It's pretty weird that it alters the performance... Do you mind opening a PR and I'll review/test it?