Project abstract
For Assignment 4 of the CS 224n course, we reimplemented the Bi-Directional Attention Flow (BiDAF) model. We built the architecture from scratch, tuned the network, and tried different regularization and out-of-vocabulary handling strategies. Our ensemble of five single models achieves an F1 score of 76.5 and an EM score of 66.3 on the test set. More information about this project can be found in:
- question-answering-system.pdf
- poster.pdf
Programming Assignment 4
How to train
Remember to change the parameters in code/train.py. Run your model by:
$ python code/train.py
How to check locally
- python process_glove.py --glove_dir download
- CUDA_VISIBLE_DEVICES='' python code/qa_answer.py --train_dir train
- python code/evaluate.py data/squad/dev-v1.1.json dev-prediction.json
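The F1 and EM numbers quoted in the abstract are the usual SQuAD metrics. The sketch below shows how they are commonly computed for a single prediction. It follows the standard SQuAD v1.1 recipe (normalize, then compare tokens) and assumes that code/evaluate.py does something equivalent. The function names here are illustrative and not taken from this repository.

```python
import re
import string
from collections import Counter

_PUNCT = set(string.punctuation)


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCT)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, ground_truth):
    """EM: 1 if the normalized strings are identical, else 0."""
    return normalize_answer(prediction) == normalize_answer(ground_truth)


def f1_score(prediction, ground_truth):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(ground_truth).split()
    # Multiset intersection counts each shared token at most min(count) times.
    num_same = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = 1.0 * num_same / len(pred_tokens)
    recall = 1.0 * num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_references(metric, prediction, ground_truths):
    """A question may have several reference answers; keep the best score."""
    return max(metric(prediction, truth) for truth in ground_truths)
```

The dataset-level scores are then the averages of these per-question values, typically reported as percentages.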
How to submit
- Change the parameters in code/qa_answer.py, and make sure they're the same as what you used in code/train.py. You need to specify context_maxlen and question_maxlen (cannot be None).
- Make sure your model is runnable by running
  $ python code/qa_answer.py
- Run the submission script with the following command. You'll need to log in to Codalab. This script will block until the job is complete.
  $ ./codalab_run-predict.sh
- To submit the sanity check, run the following command. Visit Codalab to see the results.
  $ cl edit run-predict -T cs224n-win17-submit-sanity-check
- To submit dev:
  $ cl edit run-predict -T cs224n-win17-submit-dev
- To submit test:
  $ cl edit run-predict -T cs224n-win17-submit-test
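The requirement that context_maxlen and question_maxlen be set (not None) most likely reflects that the model consumes fixed-length inputs at prediction time. As a minimal sketch of what that implies, and not the code in code/qa_answer.py, the hypothetical helper below truncates or pads a list of token ids to a given length and returns a mask marking the real tokens. The names pad_or_truncate and pad_id are assumptions made for illustration.

```python
def pad_or_truncate(token_ids, max_len, pad_id=0):
    """Return (ids, mask), each with exactly max_len entries.

    Hypothetical helper: sequences longer than max_len are cut off,
    shorter ones are right-padded with pad_id. The mask is True for
    real tokens and False for padding.
    """
    if max_len is None:
        raise ValueError("max_len must be an integer, not None")
    ids = list(token_ids[:max_len])
    mask = [True] * len(ids)
    n_pad = max_len - len(ids)
    return ids + [pad_id] * n_pad, mask + [False] * n_pad


# Example: a 3-token question padded to length 5.
question_ids, question_mask = pad_or_truncate([5, 6, 7], 5)
```

Because truncation discards tokens, using a smaller context_maxlen at prediction time than at training time could cut off the answer span, which is one reason to keep the two scripts' settings identical.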