EdGENetworks / attention-networks-for-classification

Hierarchical Attention Networks for Document Classification in PyTorch


single lstm for all sentences

opened this issue · comments

According to my understanding, the LSTMs trained on different sentences should be different, but in your model every sentence shares the same LSTM parameters for wordAttnRNN?

The paper does not seem to mention training different RNNs for different sentences. We force the classifier at the end to learn a shared representation over both the sequence of sentences and the sequence of words. This shared representation helps the model generalise well to unseen data.
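To make the weight sharing concrete, here is a minimal sketch of a word-level encoder being reused across every sentence in a document. The class and variable names are illustrative, not the repo's actual API; the point is that a single `nn.GRU` module (one parameter set) encodes all sentences.

```python
import torch
import torch.nn as nn

class WordAttnRNN(nn.Module):
    """Hypothetical minimal word-level encoder (names are illustrative)."""
    def __init__(self, vocab_size=100, emb_dim=16, hidden_dim=8):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, sent):
        # sent: (batch, words) -> final hidden state per sentence
        _, h = self.gru(self.emb(sent))
        return h.squeeze(0)  # (batch, hidden_dim)

word_rnn = WordAttnRNN()  # one module, hence one shared set of parameters
doc = torch.randint(0, 100, (3, 4, 5))  # (batch, sentences, words)

# The same GRU weights encode every sentence in the document.
sent_vecs = torch.stack(
    [word_rnn(doc[:, i]) for i in range(doc.size(1))], dim=1
)
print(sent_vecs.shape)  # (batch, sentences, hidden_dim)
```

A sentence-level RNN would then run over `sent_vecs`, again with a single shared parameter set across documents.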

There is nothing stopping you from training different GRUs, though; I just think it would be a futile effort.