JulesBelveze / time-series-autoencoder

PyTorch Dual-Attention LSTM-Autoencoder For Multivariate Time Series


AssertionError: Pytorch Issue with prediction window > 1

Rajmehta123 opened this issue · comments

AssertionError if prediction window > 1.

torch==1.4.0

Traceback (most recent call last):

  File "<ipython-input-96-df6f6f907e9b>", line 107, in <module>
    run(vars(args))

  File "<ipython-input-96-df6f6f907e9b>", line 90, in run
    train_iter, test_iter, nb_features = ts.get_loaders(batch_size=config["batch_size"])

  File "/Users/rmehta/fin/AI4Fin/time-series-autoencoder/tsa/dataset.py", line 84, in get_loaders
    train_dataset = self.frame_series(X_train, y_train)

  File "/Users/rmehta/fin/AI4Fin/time-series-autoencoder/tsa/dataset.py", line 70, in frame_series
    return TensorDataset(features_var, y_hist_var, target_var)

  File "/opt/anaconda3/envs/rlf/lib/python3.7/site-packages/torch/utils/data/dataset.py", line 158, in __init__
    assert all(tensors[0].size(0) == tensor.size(0) for tensor in tensors)

AssertionError

Hey @Rajmehta123, thanks for pointing that out!
I will have a look at it!

I have a hacky solution. I'm not sure it is robust enough, but I tested it with multiple prediction windows and training went well. You are unsqueezing the features and y_hist, but not the targets. So when prediction_window is 1, the shapes happen to match, but when it is > 1, the tensors' shapes disagree.
Let me see if I can open a PR for this. It needs to be tested extensively though.
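For context, `TensorDataset` asserts that all of its tensors share the same `size(0)`. Here is a stdlib-only sketch (no torch; the frame counts are illustrative, not taken from the repo) of why leaving the 1-D target chunks un-unsqueezed trips that assertion once the prediction window exceeds 1:

```python
# torch.cat along dim 0 sums the chunks' size(0), so n equal chunks whose
# first dimension is k concatenate to a tensor with size(0) == n * k.
def cat_size0(n_chunks, chunk_size0):
    return n_chunks * chunk_size0

n_frames, prediction_window = 50, 3  # illustrative values

# features / y_hist are unsqueezed to shape (1, ...) per frame:
features_size0 = cat_size0(n_frames, 1)                # 50
# targets are left 1-D with shape (prediction_window,):
target_size0 = cat_size0(n_frames, prediction_window)  # 150

# TensorDataset asserts all tensors share size(0); with prediction_window == 1
# both equal n_frames, but for anything larger the assertion fires.
assert features_size0 != target_size0
```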

Sounds amazing @Rajmehta123!
I'll review that PR and run some tests :)

Hey Jules, I changed the dataset loader's frame function to the following:

    def frame_series(self, X, y=None):
        """
        Prepare the data for time series prediction.
        :param X: set of features
        :param y: target values to predict
        :return: TensorDataset
        """
        nb_obs, nb_features = X.shape
        features, target, y_hist = [], [], []

        for i in range(1, nb_obs - self.seq_length - self.prediction_window):
            features.append(torch.FloatTensor(X[i:i + self.seq_length, :]).unsqueeze(0))
            y_hist.append(torch.FloatTensor(y[i:i + self.seq_length]).unsqueeze(0))

        features_var, y_hist_var = torch.cat(features), torch.cat(y_hist)

        if y is not None:
            for i in range(1, nb_obs - self.seq_length - self.prediction_window):
                target.append(
                    torch.FloatTensor(y[i + self.seq_length:i + self.seq_length + self.prediction_window]).unsqueeze(0)
                )
            target_var = torch.cat(target)
            return TensorDataset(features_var, y_hist_var, target_var)

        return TensorDataset(features_var)
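As a quick stdlib-only sanity check (illustrative numbers, not from the repo): the feature and target loops above iterate over the same range, and every frame is unsqueezed to `size(0) == 1`, so all three tensors passed to `TensorDataset` end up with the same first dimension:

```python
# Both loops in the patched frame_series use the same range, so features,
# y_hist, and target all produce the same number of frames.
nb_obs, seq_length, prediction_window = 100, 10, 3  # illustrative values

n_frames = len(range(1, nb_obs - seq_length - prediction_window))
print(n_frames)  # 86 windows each for features, y_hist, and target
```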

I just unsqueezed the targets as well, so they have the same dimensions as the features/y_hist. But that deteriorated the accuracy, so unsqueezing the targets didn't help. The tensor shapes are not robust: the target tensor has a different shape whenever prediction_window > 1.

Hey @Rajmehta123, thanks for spending time on this! 😃
It's pretty weird that it alters the performance... Do you mind opening a PR and I'll review/test it?