semantics_project

Instructions

The CNN dataset can be downloaded from here.

Global configurations are in config.py.

To run naive baselines including word-distance and max-frequency (inclusive/exclusive):

python NaiveBaselines.py
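
For orientation, here is a minimal sketch of what a max-frequency baseline could look like. It is not the code in NaiveBaselines.py: the function name `max_frequency`, the `exclusive` flag, and the assumption that entities appear as `@entity<N>` tokens (as in the anonymized CNN data) are all illustrative. The inclusive variant picks the most frequent entity in the context; the exclusive variant first drops entities that also occur in the query.

```python
from collections import Counter


def max_frequency(context_tokens, query_tokens, exclusive=False):
    """Return the most frequent entity token in the context, or None.

    Hypothetical helper: assumes entities are tokens starting with "@entity".
    With exclusive=True, entities that also occur in the query are skipped.
    """
    banned = set(query_tokens) if exclusive else set()
    counts = Counter(
        tok for tok in context_tokens
        if tok.startswith("@entity") and tok not in banned
    )
    if not counts:
        return None
    # Ties are broken in favour of the entity seen first in the context.
    return counts.most_common(1)[0][0]


context = "@entity1 met @entity2 after @entity1 left @entity3".split()
query = "@placeholder met @entity1".split()
print(max_frequency(context, query))                  # inclusive variant
print(max_frequency(context, query, exclusive=True))  # exclusive variant
```
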

To train a (very preliminary) two-layer unidirectional LSTM:

python DeepLSTMReader.py

Generate the vocabulary list if vocab.txt does not exist:

python gen_vocab.py
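
The "generate only if missing" behaviour could be sketched roughly as follows. This is an assumption about the general approach, not the contents of gen_vocab.py: `build_vocab`, `ensure_vocab`, the frequency-sorted ordering, and the one-word-per-line file layout are hypothetical choices made for the example.

```python
import os
import tempfile
from collections import Counter


def build_vocab(token_lists, max_size=None):
    """Return words sorted by descending frequency, then alphabetically."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    words = [word for word, _ in ordered]
    return words[:max_size] if max_size else words


def ensure_vocab(path, token_lists):
    """Write the vocabulary to `path` only if it does not exist, then load it."""
    if not os.path.exists(path):
        with open(path, "w", encoding="utf-8") as f:
            f.write("\n".join(build_vocab(token_lists)) + "\n")
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


docs = [["a", "b", "a"], ["c", "b", "a"]]
with tempfile.TemporaryDirectory() as tmp:
    vocab_path = os.path.join(tmp, "vocab.txt")
    first = ensure_vocab(vocab_path, docs)       # file is created here
    second = ensure_vocab(vocab_path, [["z"]])   # existing file is reused
    print(first, second)
```

Because the second call finds an existing file, it returns the same list as the first and ignores the new tokens; delete vocab.txt to force regeneration.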

check whether the model architecture is correct (the current model doesn't seem to converge to a meaningful local minimum, suggesting there might be bugs)
optimize speed and memory efficiency
implement some other baselines (e.g. SVM)