semantics_project
Instructions
The CNN dataset can be downloaded from here.
Global configurations are in config.py.
To run naive baselines including word-distance and max-frequency (inclusive/exclusive):
python NaiveBaselines.py
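For orientation, the max-frequency baseline can be sketched as below. This is a minimal illustration and not the code in NaiveBaselines.py. It assumes that entities are anonymized as tokens starting with "@entity" (as in the CNN dataset), and that "exclusive" means entities that also occur in the query are skipped. The function name max_frequency and its parameters are hypothetical.

```python
from collections import Counter


def max_frequency(context_tokens, query_tokens, exclusive=False):
    """Return the most frequent entity token in the context, or None.

    Assumption: entities are tokens that start with "@entity".
    With exclusive=True, entities that also appear in the query are ignored.
    """
    query_entities = {t for t in query_tokens if t.startswith("@entity")}
    counts = Counter(
        t
        for t in context_tokens
        if t.startswith("@entity") and not (exclusive and t in query_entities)
    )
    if not counts:
        return None  # no candidate entity left
    return counts.most_common(1)[0][0]


context = "@entity1 met @entity2 . @entity1 said @entity2 and @entity1".split()
query = "@placeholder met @entity1".split()
inclusive_answer = max_frequency(context, query)
exclusive_answer = max_frequency(context, query, exclusive=True)
```

In this toy example the inclusive variant picks the entity with the highest count in the context, while the exclusive variant falls back to the most frequent entity that the query does not mention.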
To train a (very preliminary) two-layered unidirectional LSTM:
python DeepLSTMReader.py
Generate the vocabulary list if vocab.txt does not exist:
python gen_vocab.py
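The "generate only if missing" step can be sketched as follows. This is an assumption about what such a script might do, not the contents of gen_vocab.py: it assumes whitespace tokenization and a vocabulary file with one word per line, ordered by frequency. The names build_vocab and ensure_vocab are hypothetical.

```python
import os
from collections import Counter


def build_vocab(texts, max_size=None):
    """Return words sorted by descending frequency (whitespace tokenization)."""
    counts = Counter(tok for text in texts for tok in text.split())
    return [word for word, _ in counts.most_common(max_size)]


def ensure_vocab(path, texts):
    """Write the vocabulary to `path` only if the file is missing, then load it."""
    if not os.path.exists(path):
        with open(path, "w", encoding="utf-8") as f:
            f.write("\n".join(build_vocab(texts)) + "\n")
    with open(path, encoding="utf-8") as f:
        return f.read().split()
```

A call such as ensure_vocab("vocab.txt", documents) would then leave an existing vocab.txt untouched and only build it on the first run.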
Todo
- check whether the model architecture is correct (the current model doesn't seem to converge to a meaningful local minimum, suggesting there might be bugs)
- optimize speed and memory efficiency
- implement some other baselines (e.g. SVM)